Asymptotic variance of random symmetric digital search trees



Hsien-Kuei Hwang
Institute of Statistical Science
Academia Sinica
Taipei 115
Taiwan

Michael Fuchs
Department of Applied Mathematics
National Chiao Tung University
Hsinchu 300
Taiwan

Vytas Zacharovas
Institute of Statistical Science
Academia Sinica
Taipei 115
Taiwan

March 4, 2010 



Abstract 

Asymptotics of the variances of many cost measures in random digital search trees are often no- 
toriously messy and involved to obtain. A new approach is proposed to facilitate such an analysis for 
several shape parameters on random symmetric digital search trees. Our approach starts from a more 
careful normalization at the level of Poisson generating functions, which then provides an asymptoti- 
cally equivalent approximation to the variance in question. Several new ingredients are also introduced 
such as a combined use of the Laplace and Mellin transforms and a simple, mechanical technique for 
justifying the analytic de-Poissonization procedures involved. The methodology we develop can be 
easily adapted to many other problems with an underlying binomial distribution. In particular, the less 
expected and somewhat surprising $n(\log n)^2$-variance for certain notions of total path-length is also
clarified. 

Key words: Digital search trees, Poisson generating functions, Poissonization, Laplace transform, Mellin 
transform, saddle-point method, Colless index, weighted path-length 

Dedicated to the 60th birthday of Philippe Flajolet 



1 Introduction 

The variance of a distribution provides an important measure of dispersion of the distribution and plays a crucial and, in many cases, a determining role in the limit law.¹ Thus finding more effective means of computing the variance is often of considerable significance in theory and in practice. However, the calculation of the variance can be computationally or intrinsically difficult, either because of the messy procedures or cancellations involved, or because the dependence structure is too strong, or simply because no simple manageable forms or reductions are available. We are concerned in this paper with random digital trees, for which asymptotic approximations to the variance are often marked by heavy calculations and long, messy expressions. This paper proposes a general approach that simplifies not only the analysis but also the resulting expressions, providing new insight into the methodology; furthermore, it is applicable to many other concrete situations and leads readily to several new results, shedding new light on the stochastic behaviors of the random splitting structures.

¹ The first formal use of the term "variance" in its statistical sense is generally attributed to R. A. Fisher in his 1918 paper (see [20] or Wikipedia's webpage on variance), although its practical use in diverse scientific disciplines predated this by a few centuries (including closely-defined terms such as mean-squared errors and standard deviations).

A binomial splitting process. The analysis of many splitting procedures in computer algorithms leads 
naturally to a structural decomposition (in terms of the cardinalities) of the form 




where $B_n$ is essentially a binomial distribution (up to truncation or small perturbations) and the sum $B_n + \bar{B}_n$ is essentially $n$.

Concrete examples in the literature include (see the books [15, 28, 44, 50, 62] and below for more 
detailed references) 

• tries, contention-resolution tree algorithms, initialization problem in distributed networks, and radix sort: $B_n = \mathrm{Binomial}(n; p)$ and $\bar{B}_n = n - B_n$, namely, $\mathbb{P}(B_n = k) = \binom{n}{k} p^k q^{n-k}$ (here and throughout this paper, $q := 1 - p$);

• bucket digital search trees (DSTs), directed diffusion-limited aggregation on Bethe lattice, and Eden model: $B_n = \mathrm{Binomial}(n - b; p)$ and $\bar{B}_n = n - b - B_n$;

• Patricia tries and suffix trees: $\mathbb{P}(B_n = k) = \binom{n}{k} p^k q^{n-k} / (1 - p^n - q^n)$ and $\bar{B}_n = n - B_n$.

Yet another general form arises in the analysis of multi-access broadcast channel where

$$\begin{cases} B_n = \mathrm{Binomial}(n; p) + \mathrm{Poisson}(\lambda), \\ \bar{B}_n = n - \mathrm{Binomial}(n; p) + \mathrm{Poisson}(\lambda), \end{cases}$$

see [19, 33]. For some other variants, see [2, 6, 25]. One reason for the ubiquity of the binomial distribution is simply the binary outcomes (either zero or one, either on or off, either positive or negative, etc.) of many practical situations, resulting in the natural adoption of the Bernoulli distribution in the modeling.

Poisson generating function and the Poisson heuristic. A very useful, standard tool for the analysis of these binomial splitting processes is the Poisson generating function

$$\tilde{f}(z) = e^{-z} \sum_{k \ge 0} \frac{a_k}{k!} z^k,$$

where $\{a_k\}$ is a given sequence, one distinctive feature being the Poisson heuristic, which predicts that

If $a_n$ is smooth enough, then $a_n \sim \tilde{f}(n)$.

In more precise words, if the sequence $\{a_k\}$ does not grow too fast (usually at most of polynomial growth) or does not fluctuate too violently, then $a_n$ is well approximated by $\tilde{f}(n)$ for large $n$. For example, if $\tilde{f}(z) = z^m$, $m = 0, 1, \dots$, then $a_n \sim n^m$; indeed, in such a simple case, $a_n = n(n-1)\cdots(n-m+1)$.
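The example just given can be checked numerically. The following is a minimal sketch (the function names and the truncation at 200 terms are choices made here for illustration, not part of the paper): it evaluates the Poisson generating function of the falling factorials $a_k = k(k-1)\cdots(k-m+1)$ and compares it with $z^m$.

```python
import math

def poisson_gf(a, z, terms=200):
    """Evaluate e^{-z} * sum_{k < terms} a(k) z^k / k! (truncated series)."""
    total = 0.0
    weight = math.exp(-z)          # Poisson weight e^{-z} z^k / k! at k = 0
    for k in range(terms):
        total += a(k) * weight
        weight *= z / (k + 1)      # move the weight from k to k + 1
    return total

def falling_factorial(n, m):
    """n (n-1) ... (n-m+1); equals 0 when 0 <= n < m."""
    out = 1
    for i in range(m):
        out *= n - i
    return out

m, z = 3, 7.5
value = poisson_gf(lambda k: falling_factorial(k, m), z)
print(value, z ** m)   # the two numbers should agree up to rounding
```

Here the heuristic is exact at the level of generating functions, while $a_n$ and $n^m$ differ only in lower-order terms.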

Note that the Poisson heuristic is itself a Tauberian theorem for the Borel mean in essence; an Abelian 
type theorem can be found in Ramanujan's Notebooks (see [3, p. 58]). 

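From an elementary viewpoint, such a heuristic is based on the local limit theorem of the Poisson distribution (or essentially Stirling's formula for $n!$)

$$\frac{n^k}{k!} e^{-n} \sim \frac{e^{-x^2/2}}{\sqrt{2\pi n}} \left(1 + \frac{x^3 - 3x}{6\sqrt{n}} + \cdots\right) \qquad (k = n + x\sqrt{n}),$$

whenever $x = o(n^{1/6})$. Since $a_n$ is smooth, we then expect that

$$\tilde{f}(n) \approx \sum_{\substack{k = n + x\sqrt{n} \\ x = O(n^{\varepsilon})}} a_k \frac{e^{-x^2/2}}{\sqrt{2\pi n}} \approx a_n \int_{-\infty}^{\infty} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, dx = a_n.$$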

On the other hand, by Cauchy's integral representation, we also have

$$a_n = \frac{n!}{2\pi i} \oint_{|z|=n} z^{-n-1} e^{z} \tilde{f}(z) \, dz \approx \tilde{f}(n) \frac{n!}{2\pi i} \oint_{|z|=n} z^{-n-1} e^{z} \, dz = \tilde{f}(n),$$

since the saddle-point $z = n$ of the factor $z^{-n} e^{z}$ is unaltered by the comparatively more smooth function $\tilde{f}(z)$.

The Poisson-Charlier expansion. The latter analytic viewpoint provides an additional advantage of obtaining an expansion by using the Taylor expansion of $\tilde{f}$ at $z = n$, yielding

$$a_n = \sum_{j \ge 0} \frac{\tilde{f}^{(j)}(n)}{j!} \tau_j(n), \qquad (1)$$

where

$$\tau_j(n) := n! [z^n] (z-n)^j e^{z} = \sum_{0 \le \ell \le j} \binom{j}{\ell} (-1)^{j-\ell} \frac{n! \, n^{j-\ell}}{(n-\ell)!} \qquad (j = 0, 1, \dots),$$

and $[z^n]\phi(z)$ denotes the coefficient of $z^n$ in the Taylor expansion of $\phi(z)$. We call such an expansion the Poisson-Charlier expansion since the $\tau_j$'s are essentially the Charlier polynomials $C_j(\lambda, n)$ defined by

$$C_j(\lambda, n) := \lambda^{-n} n! [z^n] (z-1)^j e^{\lambda z},$$

so that $\tau_j(n) = n^j C_j(n, n)$. For other terms used in the literature, see [28, 29].
The first few terms of $\tau_j(n)$ are given as follows.

    τ_0(n) = 1
    τ_1(n) = 0
    τ_2(n) = -n
    τ_3(n) = 2n
    τ_4(n) = 3n(n - 2)
    τ_5(n) = -4n(5n - 6)
    τ_6(n) = -5n(3n^2 - 26n + 24)

It is easily seen that $\tau_j(n)$ is a polynomial in $n$ of degree $\lfloor j/2 \rfloor$.
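The values listed above can be reproduced directly from the finite-sum form of $\tau_j(n)$. The sketch below is illustrative only (the helper names are chosen here); it uses exact integer arithmetic and compares with the closed forms for $j \le 6$.

```python
from math import comb

def falling_factorial(n, l):
    """n (n-1) ... (n-l+1) = n! / (n-l)!, and 0 when l > n >= 0."""
    out = 1
    for i in range(l):
        out *= n - i
    return out

def tau(j, n):
    """tau_j(n) = n! [z^n] (z - n)^j e^z, expanded with the binomial theorem."""
    return sum(comb(j, l) * (-n) ** (j - l) * falling_factorial(n, l)
               for l in range(j + 1))

# Closed forms for j = 0..6, as listed in the text.
closed_forms = {
    0: lambda n: 1,
    1: lambda n: 0,
    2: lambda n: -n,
    3: lambda n: 2 * n,
    4: lambda n: 3 * n * (n - 2),
    5: lambda n: -4 * n * (5 * n - 6),
    6: lambda n: -5 * n * (3 * n * n - 26 * n + 24),
}

for j, form in closed_forms.items():
    print(j, [tau(j, n) for n in range(1, 5)], [form(n) for n in range(1, 5)])
```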

The meaning of such a Poisson-Charlier expansion becomes readily clear by the following simple but extremely useful lemma.

Lemma 1.1. Let $\tilde{f}(z) := e^{-z} \sum_{k \ge 0} \frac{a_k}{k!} z^k$. If $\tilde{f}$ is an entire function, then the Poisson-Charlier expansion (1) provides an identity for $a_n$.

Proof. Since $\tilde{f}$ is entire, we have

$$\sum_{k \ge 0} \frac{a_k}{k!} z^k = e^{z} \tilde{f}(z) = e^{z} \sum_{j \ge 0} \frac{\tilde{f}^{(j)}(n)}{j!} (z-n)^j,$$

and the lemma follows by absolute convergence. □

Two specific examples are worthy of mention here as they speak volumes about the difference between identity and asymptotic equivalence. Take first $a_n = (-1)^n$. Then the Poisson heuristic fails since $(-1)^n \not\sim e^{-2n}$, but, by Lemma 1.1, we have the identity

$$(-1)^n = e^{-2n} \sum_{j \ge 0} \frac{(-2)^j}{j!} \tau_j(n).$$

See Figure 1 for a plot of the convergence of the series to $(-1)^n$.




Figure 1: Convergence of $e^{-2n} \sum_{j < k} (-2)^j \tau_j(n)/j!$ to $(-1)^n$ for $n = 10$ (left) and $n = 11$ (right) for increasing $k$.

Now if $a_n = 2^n$, then $2^n \not\sim e^{n}$, but we still have

$$2^n = e^{n} \sum_{j \ge 0} \frac{\tau_j(n)}{j!}.$$



So when is the Poisson-Charlier expansion also an asymptotic expansion for $a_n$, in the sense that dropping all terms with $j \ge 2\ell$ introduces an error of order $\tilde{f}^{(2\ell)}(n) n^{\ell}$ (which in typical cases is of order $\tilde{f}(n) n^{-\ell}$)? Many sufficient conditions are thoroughly discussed in [36], although the terms in their expansions are expressed differently; see also [62].
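The two identities of Lemma 1.1 for $a_n = (-1)^n$ and $a_n = 2^n$ can be verified numerically. The following sketch is an illustration under assumptions made here: the series is truncated at 200 terms, and exact rational arithmetic is used for the partial sums because the individual terms are large and alternate in sign before the series settles.

```python
from fractions import Fraction
from math import comb, exp, factorial

def falling_factorial(n, l):
    out = 1
    for i in range(l):
        out *= n - i
    return out

def tau(j, n):
    """tau_j(n) = n! [z^n] (z - n)^j e^z; terms with l > n vanish."""
    return sum(comb(j, l) * (-n) ** (j - l) * falling_factorial(n, l)
               for l in range(min(j, n) + 1))

def charlier_partial_sum(n, ratio, terms):
    """Exact value of sum_{j < terms} ratio^j tau_j(n) / j!."""
    return sum(Fraction(ratio ** j * tau(j, n), factorial(j))
               for j in range(terms))

n = 10
# a_n = (-1)^n: Poisson generating function e^{-2z}, j-th derivative (-2)^j e^{-2z}.
alternating = exp(-2 * n) * float(charlier_partial_sum(n, -2, 200))
# a_n = 2^n: Poisson generating function e^{z}, every derivative equals e^{z}.
power_of_two = exp(n) * float(charlier_partial_sum(n, 1, 200))
print(alternating, power_of_two)
```

In both cases the expansion reproduces $a_n$ although $\tilde{f}(n)$ alone is far from $a_n$, which is the distinction between identity and asymptotic equivalence drawn in the text.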



Poissonized mean and variance. The majority of random variables analyzed in the algorithmic literature are at most of polynomial or sub-exponential (such as $e^{c(\log n)^2}$ or $e^{c n^{1/2}}$) orders, and are smooth enough. Thus the Poisson generating functions of the moments are often entire functions. The use of the Poisson-Charlier expansion is then straightforward, and in many situations it remains to justify the asymptotic nature of the expansion.

For convenience of discussion, let $\tilde{f}_m(z)$ denote the Poisson generating function of the $m$-th moment of the random variable in question, say $X_n$. Then by Lemma 1.1, we have the identity

$$\mathbb{E}(X_n) = \sum_{j \ge 0} \frac{\tilde{f}_1^{(j)}(n)}{j!} \tau_j(n),$$

and for the second moment

$$\mathbb{E}(X_n^2) = \sum_{j \ge 0} \frac{\tilde{f}_2^{(j)}(n)}{j!} \tau_j(n), \qquad (2)$$

provided only that the two Poisson generating functions $\tilde{f}_1$ and $\tilde{f}_2$ are entire functions.

These identities suggest that a good approximation to the variance of $X_n$ is given by

$$\mathbb{V}(X_n) = \mathbb{E}(X_n^2) - (\mathbb{E}(X_n))^2 \approx \tilde{f}_2(n) - \tilde{f}_1(n)^2,$$

which holds true for many cost measures, where we can indeed replace the imprecise, approximately equal symbol "$\approx$" by the more precise, asymptotically equivalent symbol "$\sim$". However, for a large class of problems for which the variance is essentially linear, meaning roughly that

$$\lim_{n \to \infty} \frac{\log \mathbb{V}(X_n)}{\log n} = 1, \qquad (3)$$

the Poissonized variance $\tilde{f}_2(n) - \tilde{f}_1(n)^2$ is not asymptotically equivalent to the variance. This is the case for the total cost of constructing random digital search trees, for example. One technical reason is that there are additional cancellations produced by dominant terms. The next question is then: can we find a better normalized function so that the variance is asymptotically equivalent to its value at $n$?


Poissonized variance with correction. The crucial step of our approach that is needed when the variance is essentially linear is to consider

$$\tilde{V}(z) := \tilde{f}_2(z) - \tilde{f}_1(z)^2 - z \tilde{f}_1'(z)^2, \qquad (4)$$

and it then turns out that

$$\mathbb{V}(X_n) = \tilde{V}(n) + O((\log n)^c),$$

in all cases we consider for some $c \ge 0$. The asymptotics of the variance is then reduced to that of $\tilde{V}(z)$ for large $z$, which satisfies, up to non-homogeneous terms, the same type of equation as $\tilde{f}_1(z)$. Thus the same tools used for analyzing the mean can be applied to $\tilde{V}(z)$.

To see how the last correction term $z \tilde{f}_1'(z)^2$ appears, we write $\tilde{D}(z) := \tilde{f}_2(z) - \tilde{f}_1(z)^2$, so that $\tilde{f}_2(z) = \tilde{D}(z) + \tilde{f}_1(z)^2$, and we obtain, by substituting this into (2),

$$\begin{aligned}
\mathbb{V}(X_n) &= \mathbb{E}(X_n^2) - (\mathbb{E}(X_n))^2 \\
&= \sum_{j \ge 0} \frac{\tilde{f}_2^{(j)}(n)}{j!} \tau_j(n) - \Biggl(\sum_{j \ge 0} \frac{\tilde{f}_1^{(j)}(n)}{j!} \tau_j(n)\Biggr)^{2} \\
&= \tilde{D}(n) - n \tilde{f}_1'(n)^2 - \frac{n}{2} \tilde{D}''(n) + \text{smaller-order terms}.
\end{aligned}$$

Now take $\tilde{f}_1(n) \asymp n \log n$. Then the first term following $\tilde{D}(n)$ is generally not smaller than $\tilde{D}(n)$ because

$$n \tilde{f}_1'(n)^2 \asymp n (\log n)^2,$$

while $\tilde{D}(n) \asymp n (\log n)^2$, at least for the examples we discuss in this paper. Note that the variance is in such a case either of order $n \log n$ or of order $n$. Thus to get an asymptotically equivalent approximation to the variance, we need at least an additional correction term, which is exactly $n \tilde{f}_1'(n)^2$.

The correction term $n \tilde{f}_1'(n)^2$ already appeared in many early papers by Jacquet and Régnier (see [34]).
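The effect of the correction term can be observed numerically for the total path length of symmetric DSTs. The sketch below is only an illustration under assumptions made here: it uses the distributional recurrence $X_{n+1} = X_{B_n} + X^*_{n-B_n} + n$ with $B_n \sim \mathrm{Binomial}(n; 1/2)$ (stated later in the paper as (6)), floating-point arithmetic, and Poisson sums truncated at $K = 256$ terms; all function names are chosen for this example.

```python
from math import comb, exp

def dst_moments(K):
    """First and second moments of X_0..X_K from the recurrence X_{n+1} = X_B + X*_{n-B} + n."""
    mu = [0.0] * (K + 1)
    s = [0.0] * (K + 1)
    for n in range(K):                       # X_{n+1} is built from sizes 0..n
        m1 = m2 = 0.0
        for k in range(n + 1):
            p = comb(n, k) / 2 ** n          # P(B_n = k)
            a, b = mu[k], mu[n - k]
            m1 += p * (a + b + n)
            m2 += p * (s[k] + s[n - k] + 2 * a * b + 2 * n * (a + b) + n * n)
        mu[n + 1], s[n + 1] = m1, m2
    return mu, s

def poisson_sum(seq, z):
    """e^{-z} sum_k seq[k] z^k / k!, truncated at len(seq) terms."""
    total, w = 0.0, exp(-z)
    for k, a in enumerate(seq):
        total += a * w
        w *= z / (k + 1)
    return total

n, K = 64, 256
mu, s = dst_moments(K + 1)
f1 = poisson_sum(mu[:K + 1], n)
f2 = poisson_sum(s[:K + 1], n)
# Derivative of a Poisson generating function: coefficients a_{k+1} - a_k.
f1_prime = poisson_sum([mu[k + 1] - mu[k] for k in range(K + 1)], n)

exact_var = s[n] - mu[n] ** 2
D = f2 - f1 ** 2                  # Poissonized variance
V = D - n * f1_prime ** 2         # Poissonized variance with correction, as in (4)
print(exact_var, D, V)
```

In this experiment the uncorrected quantity `D` is far larger than the exact variance, while `V` stays close to it, in line with the discussion above.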

A viewpoint from the asymptotics of the characteristic function. Most binomial recurrences of the form

$$X_n \stackrel{d}{=} X_{B_n} + X^*_{\bar{B}_n} + T_n, \qquad (5)$$

as arising from the binomial splitting processes discussed above are asymptotically normally distributed, a property partly ascribable to the highly regular behavior of the binomial distribution. Here the $(X_n^*)$ are independent copies of the $(X_n)$ and the random or deterministic non-homogeneous part $T_n$ is often called the "toll-function," measuring the cost used to "conquer" the two subproblems. Such recurrences have been extensively studied in numerous papers; see [36, 52, 58, 59] and the references therein.

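The correction term we introduced in (4) for Poissonized variance also appears naturally in the following heuristic, formal analysis, which can be justified when more properties are available. By definition and formal expansion

$$e^{-z} \sum_{n \ge 0} \frac{\mathbb{E}(e^{X_n i\theta})}{n!} z^n = \sum_{m \ge 0} \frac{\tilde{f}_m(z)}{m!} (i\theta)^m = \exp\left(\tilde{f}_1(z) i\theta - \frac{\tilde{D}(z)}{2} \theta^2 + \cdots\right),$$

where $\tilde{D}(z) := \tilde{f}_2(z) - \tilde{f}_1(z)^2$, we have

$$\mathbb{E}\left(e^{(X_n - \tilde{f}_1(n)) i\theta}\right) \approx \frac{n!}{2\pi i} \oint_{|z|=n} z^{-n-1} \exp\left(z + (\tilde{f}_1(z) - \tilde{f}_1(n)) i\theta - \frac{\tilde{D}(z)}{2} \theta^2 + \cdots\right) dz.$$

Observe that with $z = n e^{it}$, we have the local expansion

$$n e^{it} - nit + (\tilde{f}_1(n e^{it}) - \tilde{f}_1(n)) i\theta - \frac{\tilde{D}(n e^{it})}{2} \theta^2 = n - \frac{n}{2} t^2 - n \tilde{f}_1'(n) t\theta - \frac{\tilde{D}(n)}{2} \theta^2 + \cdots,$$

for small $t$. It follows that

$$\mathbb{E}\left(e^{(X_n - \tilde{f}_1(n)) i\theta}\right) \approx \exp\left(-\frac{\theta^2}{2} \left(\tilde{D}(n) - n \tilde{f}_1'(n)^2\right)\right),$$

by extending the integral to $\pm\infty$ and by completing the square. This again shows that $n \tilde{f}_1'(n)^2$ is the right correction term for the variance. For more precise analysis of this type, see [36].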

A comparison of different approaches to the asymptotic variance. What are the advantages of the 
Poissonized variance with correction? In the literature, a few different approaches have been adopted for 
computing the asymptotics of the variance of the binomial splitting processes. 

• Second moment approach: this is the most straightforward means and consists of first deriving asymptotic expansions of sufficient length for the expected value and for the second moment, then considering the difference $\mathbb{E}(X_n^2) - (\mathbb{E}(X_n))^2$, and identifying the lead terms after cancellations of dominant terms in both expansions. This approach is often computationally heavy as many terms have to be cancelled; additional complication arises from fluctuating terms, rendering the resulting expressions messier. See below for more references.

• Poissonized variance: the asymptotics of the variance is carried out through that of $\tilde{D}(n) = \tilde{f}_2(n) - \tilde{f}_1(n)^2$. The difference between this approach and the previous one is that no asymptotics of $\tilde{f}_2(n)$ is derived or needed, and one always focuses directly on the equation (functional or differential) satisfied by $\tilde{D}(z)$. As we discussed above, this does not give in many cases an asymptotically equivalent estimate for the variance, because additional cancellations have to be further taken into account; see for instance [34, 35, 36].

• Characteristic function approach: similar to the formal calculations we carried out above, this approach tries to derive a more precise asymptotic approximation to the characteristic function using, say, complex-analytic tools, and then to identify the right normalizing term as the variance; see the survey [36] and the papers cited there.

• Schachinger's differencing approach: a delicate, mostly elementary approach based on the recurrence satisfied by the variance was proposed in [58] (see also [59]). His approach is applicable to very general "toll-functions" $T_n$ in (5) but at the price of less precise expressions.

The approach we use is similar to the Poissonized variance one, but the difference is that the passage through $\tilde{D}(z)$ is completely avoided and we focus directly on equations satisfied by $\tilde{V}(z)$ (defined in (4)).

In contrast to Schachinger's approach, our approach, after starting from defining $\tilde{V}(z)$, is mostly analytic. It then yields more precise expansions, but more properties of $T_n$ have to be known. The contrast here between elementary and analytic approaches is thus typical; see, for example, [7, 8]. See also the Appendix for a brief sketch of the asymptotic linearity of the variance by elementary arguments.

Additional advantages that our approach offers include comparatively simpler forms for the resulting expressions, including Fourier series expansions, and general applicability (coupled with the introduction of several new techniques).

Organization of this paper. This paper is organized as follows. We start with the variance of the total path-length of random digital search trees in the next section, which was our motivating example. We then extend the consideration to bucket DSTs for which two different notions of total path-length are distinguished, which result in very different asymptotic behaviors. The application of our approach to several other shape parameters is discussed in Section 4. Table 1 summarizes the diverse behaviors exhibited by the means and the variances of the shape parameters we consider in this paper.



    Shape parameters     mean                  variance
    -----------------    ------------------    --------------
    Internal PL          n log n               n
    Key-wise PL*         n log n               n
    Node-wise PL*        n log n               n (log n)^2
    Peripheral PL        n                     n
    #(leaves)            n                     n
    Differential PL      n                     n log n
    Weighted PL          n (log n)^{m+1}       n

Table 1: Orders of the means and the variances of all shape parameters in this paper; those marked with an * are for b-DSTs with $b \ge 2$. Here PL denotes path-length and $m \ge 0$.

Applications of the approach we develop here to other classes of trees and structures, including tries, 
Patricia tries, bucket sort, contention resolution algorithms, etc., will be investigated in a future paper. 

2 Digital Search Trees 

We start in this section with a brief description of digital search trees (DSTs), list major shape parameters studied in the literature, and then focus on the total path-length. The approach we develop is also very useful for other linear shape measures, which are discussed in a more systematic form in the following sections.

2.1 DSTs 

DSTs were first introduced by Coffman and Eve in [9] in the early 1970's under the name of sequence hash 
trees. They can be regarded as the bit-version of binary search trees (thus the name); see [44, p. 496 et 
seq.]. Given a sequence of binary strings, we place the first in the root node; those starting with "0" ("1") 
are directed to the left (right) subtree of the root, and are constructed recursively by the same procedure 
but with the removal of their first bits when comparisons are made. See Figure 2 for an illustration. 
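The construction just described can be sketched in a few lines. The code below is an illustration, not the authors' implementation; the class and function names are chosen here, and it assumes distinct input strings that are long enough to be separated. The nine strings are those of Figure 2.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.children = {}       # maps '0' / '1' to the left / right child

def build_dst(strings):
    """Insert the strings one by one; at depth d the d-th bit selects the subtree."""
    root = None
    depth_of = {}
    for s in strings:
        if root is None:
            root = Node(s)
            depth_of[s] = 0
            continue
        node, d = root, 0
        while s[d] in node.children:     # descend while the branch is occupied
            node = node.children[s[d]]
            d += 1
        node.children[s[d]] = Node(s)    # first free position along the path
        depth_of[s] = d + 1
    return root, depth_of

keys = ["010111", "101011", "100001", "011011", "111110",
        "110111", "010011", "011110", "000100"]
root, depth_of = build_dst(keys)

profile = [0] * (max(depth_of.values()) + 1)
for d in depth_of.values():
    profile[d] += 1
total_path_length = sum(depth_of.values())
print(profile, total_path_length)
```

The list `profile` counts the nodes on each level, and `total_path_length` is the sum of the node depths, the quantity studied in Section 2.2.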

While the practical usefulness of digital search trees is limited, they represent one of the simplest, fundamental, prototype models for divide-and-conquer algorithms using coin-tossing or similar random devices. Of notable interest is their close connection to the analysis of the Lempel-Ziv compression scheme, which has been incorporated into numerous software packages. Furthermore, the mathematical analysis is often challenging and leads to intriguing phenomena. Also the splitting mechanism of DSTs appears naturally in a few problems in other areas; some of these are mentioned in the last section.

Random digital search trees. The simplest random model we discuss in this paper is the independent, Bernoulli model. In this model, we are given a sequence of $n$ independent and identically distributed random variables, each comprising an infinite sequence of Bernoulli random variables with mean $p$, $0 < p < 1$. The DST constructed from the given random sequence of binary strings is called a random DST. If $p = 1/2$, the DST is said to be symmetric; otherwise, it is asymmetric. We focus on symmetric DSTs in this paper for simplicity; extension to asymmetric DSTs is possible but much harder.

    Input strings (in order of insertion):
        010111, 101011, 100001, 011011, 111110, 110111, 010011, 011110, 000100
    Resulting tree, level by level:
        level 0: 010111
        level 1: 011011, 101011
        level 2: 000100, 010011, 100001, 111110
        level 3: 011110, 110111

Figure 2: A digital search tree of nine binary strings.

Stochastic properties of many shape characteristics of random DSTs are known. Almost all of them 
fall into one of the two categories, according to their growth order being logarithmic or essentially linear 
(in the sense of (3)), which we simply refer to as "log shape measures" and "linear shape measures". 



Log shape measures. The two major parameters studied in this category are depth, which is the distance from the root to a randomly chosen node in the tree (each with the same probability), and height, which counts the number of nodes along one of the longest paths from the root. Both are of logarithmic order in mean. Depth provides a good indication of the typical cost needed when inserting a new key in the tree, while height measures the worst possible cost that may be needed.

Depth was first studied in [45] in connection with the profile, which is the sequence of numbers, each enumerating the number of nodes with the same distance to the root. For example, a tree with one node at the root level followed by levels of two, three, two and three nodes has the profile {1, 2, 3, 2, 3}. For other papers on the depth of random DSTs, see [11, 12, 13, 37, 38, 39, 44, 46, 47, 50, 55, 60, 61]. The height of random DSTs is addressed in [13, 14, 43, 50, 55].



Linear shape measures. These include the total internal path-length, which sums the distance between 
the root and every node, and the occurrences of a given pattern (leaves or nodes satisfying certain proper- 
ties); see [24, 26, 30, 31, 35, 40, 42, 44]. 

The profile contains generally much more information than most other shape measures, and it can to 
some extent be regarded as a good bridge connecting log and linear measures; see [15, 17, 45, 46] for 
known properties concerning expected profile of random DSTs. 

Nodes of random DSTs with $p = 1/2$ are distributed in an extremely regular way, as shown in Figures 3 and 4.



2.2 Known and new results for the total internal path-length 

Throughout this section, we focus on $X_n$, the total path length of a random digital search tree built from $n$ binary strings. By definition and by our random assumption, $X_n$ can be computed recursively by

$$X_{n+1} \stackrel{d}{=} X_{B_n} + X^*_{n-B_n} + n \qquad (n \ge 0), \qquad (6)$$

with the initial condition $X_0 = 0$, since removing the root results in a decrease of $n$ for the total path length (each internal node below the root contributes 1). Here $B_n \sim \mathrm{Binomial}(n; 1/2)$, $X_n \stackrel{d}{=} X_n^*$, and $X_n$, $X_n^*$, $B_n$ are independent.

Figure 3: Two typical random DSTs.

Figure 4: Two random DSTs of 1000 nodes rendered differently. For more graphical renderings of random DSTs, see the first author's webpage algo.stat.sinica.edu.tw.

Known results. It is known that (see [26, 30, 57])

$$\begin{aligned}
\mathbb{E}(X_n) = {} & (n+1) \log_2 n + n \left(\frac{\gamma - 1}{\log 2} + \frac{1}{2} - c_1 + \varpi_1(\log_2 n)\right) \\
& + \frac{\gamma - 1/2}{\log 2} + \frac{5}{2} - c_1 + \varpi_2(\log_2 n) + O\left(n^{-1} \log n\right),
\end{aligned}$$

where $\gamma$ denotes Euler's constant, $c_1 := \sum_{k \ge 1} (2^k - 1)^{-1}$, and $\varpi_1(t)$, $\varpi_2(t)$ are 1-periodic functions with zero mean whose Fourier expansions are given by ($\chi_k := 2k\pi i/L$, $L := \log 2$)

$$\varpi_1(t) = \frac{1}{L} \sum_{k \ne 0} \Gamma(-1 - \chi_k) e^{2k\pi i t},$$

$$\varpi_2(t) = -\frac{1}{L} \sum_{k \ne 0} \left(1 - \frac{\chi_k}{2}\right) \Gamma(-\chi_k) e^{2k\pi i t},$$

respectively. Here $\Gamma$ denotes the Gamma function. Thus we see roughly that random digital search trees under the unbiased Bernoulli model are highly balanced in shape. An important feature of the periodic functions is that they are marked by very small amplitudes of fluctuation: $|\varpi_1(t)| < 3.4 \times 10^{-8}$ and $|\varpi_2(t)| < 3.4 \times 10^{-6}$. Such a quasi-flat (or smooth) behavior can easily lead to wrong conclusions in practice, since the fluctuations are hardly visible in simulations of moderate sample sizes.
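Recurrence (6) makes exact computation of the mean and the variance straightforward for moderate $n$. The following sketch (an illustration with names chosen here) uses exact rational arithmetic and compares the mean with the smooth part of the known expansion, that is, with the tiny periodic terms $\varpi_1$, $\varpi_2$ and the $O(n^{-1}\log n)$ error dropped; the numerical value of Euler's constant is hard-coded as an assumption of the example.

```python
import math
from fractions import Fraction
from math import comb

def dst_path_length_moments(N):
    """Exact mean and variance of X_0..X_N from X_{n+1} = X_B + X*_{n-B} + n."""
    mean = [Fraction(0)] * (N + 1)
    second = [Fraction(0)] * (N + 1)
    for n in range(N):
        m1 = m2 = Fraction(0)
        for k in range(n + 1):
            p = Fraction(comb(n, k), 2 ** n)          # P(B_n = k)
            a, b = mean[k], mean[n - k]
            m1 += p * (a + b + n)
            m2 += p * (second[k] + second[n - k] + 2 * a * b
                       + 2 * n * (a + b) + n * n)
        mean[n + 1], second[n + 1] = m1, m2
    variance = [second[i] - mean[i] ** 2 for i in range(N + 1)]
    return mean, variance

def smooth_asymptotic_mean(n, terms=60):
    """Smooth part of the expansion of E(X_n); periodic terms are omitted."""
    L = math.log(2)
    euler_gamma = 0.5772156649015329
    c1 = sum(1 / (2 ** k - 1) for k in range(1, terms))
    return ((n + 1) * math.log2(n)
            + n * ((euler_gamma - 1) / L + 0.5 - c1)
            + (euler_gamma - 0.5) / L + 2.5 - c1)

mean, variance = dst_path_length_moments(64)
print(mean[3], variance[3])
print(float(mean[64]), smooth_asymptotic_mean(64), float(variance[64]) / 64)
```

For three keys the path length is 2 or 3 with equal probability, which gives the mean 5/2 and the variance 1/4 printed on the first line.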




Figure 5: A plot of $\mathbb{E}(X_n)/(n+1) - \log_2 n$ in log-scale (the decreasing curve using the y-axis on the right-hand side), and that of $\mathbb{V}(X_n)/n$ in log-scale (the increasing curve using the y-axis on the left-hand side).

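Let

$$Q_k := \prod_{1 \le j \le k} \left(1 - \frac{1}{2^j}\right) \quad \text{and} \quad Q(z) := \prod_{j \ge 1} \left(1 - \frac{z}{2^j}\right). \qquad (9)$$

In particular, $Q(1) = Q_\infty$. The variance was computed in [42] by a direct second-moment approach and the result is

$$\mathbb{V}(X_n) = n\left(C_{kps} + \varpi_{kps}(\log_2 n)\right) + O(\log^2 n),$$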



where $\varpi_{kps}(t)$ is again a 1-periodic, zero-mean function and the mean value $C_{kps}$ is given in [42] by a long explicit expression, extending over many lines, in terms of $L := \log 2$, $\pi^2$, $Q_\infty$, the finite products $Q_\ell$ and several nested infinite series, together with mean values over the unit interval of products of the periodic functions. The long expression obviously shows the complexity of the asymptotic problem.

We show that this long expression can be largely simplified. Before stating our result, we mention 
that the asymptotic normality of X n (in the sense of convergence in distribution) was first proved in [35] 
by a complex-analytic approach; for other approaches, see [59] (martingale difference), [31] (method of 
moments), [52] (contraction method). 



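A new asymptotic approximation to $\mathbb{V}(X_n)$. Define

$$G_2(\omega) := Q_\infty \sum_{j,h,\ell \ge 0} \frac{(-1)^j 2^{-\binom{j+1}{2} + j(\omega - 2)}}{Q_j Q_h Q_\ell 2^{h+\ell}} \, \varphi\left(\omega; 2^{-j-h} + 2^{-j-\ell}\right),$$

where for $0 < \Re(\omega) < 3$ and $x > 0$

$$\varphi(\omega; x) := \int_0^\infty \frac{s^{\omega - 1}}{(s+1)(s+x)^2} \, ds,$$

which, by the relation

$$\int_0^\infty \frac{s^{\omega - 1}}{s + 1} \, ds = \frac{\pi}{\sin(\pi\omega)} = \Gamma(\omega) \Gamma(1 - \omega) \qquad (0 < \Re(\omega) < 1),$$

can be represented as

$$\varphi(\omega; x) = \begin{cases} \dfrac{\pi \left(1 + x^{\omega - 2} ((\omega - 2) x + 1 - \omega)\right)}{(x - 1)^2 \sin(\pi\omega)}, & \text{if } x \ne 1; \\[2ex] \dfrac{\pi (\omega - 1)(\omega - 2)}{2 \sin(\pi\omega)}, & \text{if } x = 1. \end{cases}$$

The last expression provides indeed a meromorphic continuation of $\varphi(\omega; x)$ into the whole complex $\omega$-plane whenever $x > 0$. In particular,

$$\varphi(2; x) = \begin{cases} \dfrac{x - \log x - 1}{(x - 1)^2}, & \text{if } x \ne 1; \\[2ex] \dfrac{1}{2}, & \text{if } x = 1. \end{cases}$$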
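Theorem 2.1. The variance of the total path-length of random DSTs of $n$ nodes satisfies

$$\mathbb{V}(X_n) = n\left(C_{kps} + \varpi_{kps}(\log_2 n)\right) + O(1), \qquad (11)$$

where

$$C_{kps} = \frac{G_2(2)}{\log 2} = \frac{Q_\infty}{\log 2} \sum_{j,h,\ell \ge 0} \frac{(-1)^j 2^{-\binom{j+1}{2}}}{Q_j Q_h Q_\ell 2^{h+\ell}} \, \varphi\left(2; 2^{-j-h} + 2^{-j-\ell}\right),$$

and $\varpi_{kps}$ has the Fourier series expansion

$$\varpi_{kps}(t) = \frac{1}{\log 2} \sum_{k \in \mathbb{Z} \setminus \{0\}} \frac{G_2(2 + \chi_k)}{\Gamma(2 + \chi_k)} e^{2k\pi i t},$$

which is absolutely convergent.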



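One can derive more precise asymptotic expansions for $\mathbb{V}(X_n)$ by the same approach we use. We content ourselves with (11) for convenience of presentation.

Note that

$$\frac{G_2(2 + \chi_k)}{\Gamma(2 + \chi_k)} = Q_\infty \Gamma(-1 - \chi_k) \sum_{j,h,\ell \ge 0} \frac{(-1)^j 2^{-\binom{j+1}{2}}}{Q_j Q_h Q_\ell 2^{h+\ell}} \, \Delta_k\left(2^{-j-h} + 2^{-j-\ell}\right),$$

where

$$\Delta_k(t) := \begin{cases} \dfrac{1 - t^{\chi_k} (1 + \chi_k (1 - t))}{(1 - t)^2}, & \text{if } t \ne 1; \\[2ex] \dfrac{\chi_k (\chi_k + 1)}{2}, & \text{if } t = 1. \end{cases}$$

Thus the Fourier series is absolutely convergent by the order estimate (see [18])

$$|\Gamma(c + it)| = O\left(|t|^{c - 1/2} e^{-\pi |t|/2}\right) \qquad (|t| \to \infty). \qquad (12)$$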
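The exponential decay in (12) is also what makes the periodic fluctuations so small. As a minimal illustration (not taken from the paper), the case $c = 1$ can be checked with the standard closed form $|\Gamma(1 + it)|^2 = \pi t / \sinh(\pi t)$, which follows from the reflection formula; the function names and the choice $c = 1$ are assumptions of this sketch.

```python
import math

def abs_gamma_one_plus_it(t):
    """|Gamma(1 + i t)| for t > 0, from |Gamma(1 + i t)|^2 = pi t / sinh(pi t)."""
    return math.sqrt(math.pi * t / math.sinh(math.pi * t))

def stirling_envelope(t, c=1.0):
    """sqrt(2 pi) |t|^{c - 1/2} e^{-pi |t| / 2}, the leading size of |Gamma(c + i t)|."""
    return math.sqrt(2 * math.pi) * abs(t) ** (c - 0.5) * math.exp(-math.pi * abs(t) / 2)

L = math.log(2)
rows = []
for k in (1, 2, 3):
    t = 2 * k * math.pi / L            # imaginary part of chi_k = 2 k pi i / L
    exact = abs_gamma_one_plus_it(t)
    rows.append((k, exact, exact / stirling_envelope(t)))
    print(rows[-1])
```

Already for $k = 1$ the modulus is of order $10^{-6}$, and each further harmonic is smaller by several orders of magnitude.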

Numerically, $C_{kps} \approx 0.26600\,36454\,05936\ldots$, in accordance with that given in [42]. Also $|\varpi_{kps}(t)| < 1.9 \times 10^{-5}$.

Sketch of our approach. Following the discussions in the Introduction, we first prove that the Poisson-Charlier expansion for the mean and that for the second moment are not only identities but also asymptotic expansions. For that purpose, it proves very useful to introduce the following notion, which we term JS-admissible functions (following the survey paper [35]). This is reminiscent of the classical H-admissible (due to Hayman) or HS-admissible (due to Harris-Schoenfeld) functions; see [28, §VIII.5].

Once we prove the asymptotic nature of the Poisson-Charlier expansions for the mean and the second moment, it remains, according again to the discussions in the Introduction, to derive more precise asymptotics for the function $\tilde{V}$ (as defined in (4)), for which we first use the Laplace transform, normalize the Laplace transform properly, and then apply the Mellin transform. Such an approach will turn out to be very effective and readily applicable to more general cases such as bucket DSTs, which are discussed in detail in the next section. The approach parallels closely in essence that introduced by Flajolet and Richmond in [24], which starts from the ordinary generating function, followed by an Euler transform, a proper normalization and the Mellin transform, and then concludes by singularity analysis; see also [10]. The path we take, however, offers additional operational advantages, as will be clear later. See Figure 7 for a diagrammatic illustration of the two analytic approaches.

2.3 Analytic de-Poissonization and JS-admissibility 

The fundamental differential-functional equation for the analysis of random DSTs is of the form

$$\tilde{f}(z) + \tilde{f}'(z) = 2 \tilde{f}(z/2) + \tilde{g}(z),$$

with suitably given initial value $\tilde{f}(0)$ and $\tilde{g}$. For such functions, it turns out that the asymptotic nature of the Poisson-Charlier expansions for the coefficients (or de-Poissonization) can be justified in a rather systematic way by the introduction of the notion of JS-admissible functions.

Here and throughout this paper, the generic symbol $\varepsilon \in (0, 1)$ always represents an arbitrarily small constant whose value is immaterial and may differ from one occurrence to another.

Definition 1. An entire function $\tilde{f}$ is said to be JS-admissible, denoted by $\tilde{f} \in \mathscr{JS}$, if the following two conditions hold for $|z| \ge 1$.

(I) There exist $\alpha, \beta \in \mathbb{R}$ such that uniformly for $|\arg(z)| \le \varepsilon$,

$$\tilde{f}(z) = O\left(|z|^{\alpha} (\log_+ |z|)^{\beta}\right),$$

where $\log_+ x := \log(1 + x)$.

(O) Uniformly for $\varepsilon \le |\arg(z)| \le \pi$,

$$f(z) := e^{z} \tilde{f}(z) = O\left(e^{(1 - \varepsilon)|z|}\right).$$

For convenience, we also write $\tilde{f} \in \mathscr{JS}_{\alpha,\beta}$ to indicate the growth order of $\tilde{f}$ inside the sector $|\arg(z)| \le \varepsilon$.

Note that if $\tilde{f}$ satisfies condition (I), then, by Cauchy's integral representation for derivatives (or by Ritt's theorem; see [54, Ch. 1, § 4.3]), we have

$$\tilde{f}^{(k)}(z) = O\left(|z|^{\alpha - k} (\log_+ |z|)^{\beta}\right).$$

Proposition 2.2. Assume $\tilde{f} \in \mathscr{JS}_{\alpha,\beta}$. Let $f(z) := e^{z} \tilde{f}(z)$. Then the Poisson-Charlier expansion (1) of $f^{(n)}(0)$ is also an asymptotic expansion in the sense that

$$a_n := f^{(n)}(0) = n! [z^n] f(z) = n! [z^n] e^{z} \tilde{f}(z) = \sum_{0 \le j < 2k} \frac{\tilde{f}^{(j)}(n)}{j!} \tau_j(n) + O\left(n^{\alpha - k} (\log n)^{\beta}\right),$$

for $k = 1, 2, \dots$.

Proof. (Sketch) Starting from Cauchy's integral formula for the coefficients, the proposition follows from a standard application of the saddle-point method. Roughly, condition (O) guarantees that the integral over the part of the circle with radius $n$ and argument satisfying $\varepsilon \le |\arg(z)| \le \pi$ is negligible, while condition (I) implies smooth estimates for all derivatives (and thus error terms). □
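The improvement brought by successive terms of the Poisson-Charlier expansion can be observed on the mean of the total path length of symmetric DSTs. The sketch below is an illustration under assumptions made here: the mean is computed in floating point from the recurrence $\mu_{n+1} = n + 2 \sum_k \binom{n}{k} 2^{-n} \mu_k$ (the expectation of (6)), and the Poisson sums are truncated at $K = 256$ terms. It compares $\mathbb{E}(X_n)$ with $\tilde{f}_1(n)$ and with $\tilde{f}_1(n) - \frac{n}{2} \tilde{f}_1''(n)$, using $\tau_1(n) = 0$ and $\tau_2(n) = -n$.

```python
from math import comb, exp

def dst_mean(K):
    """mu_0..mu_K with mu_{n+1} = n + 2 * sum_k P(B_n = k) * mu_k."""
    mu = [0.0] * (K + 1)
    for n in range(K):
        mu[n + 1] = n + 2 * sum(comb(n, k) / 2 ** n * mu[k] for k in range(n + 1))
    return mu

def poisson_sum(seq, z):
    """e^{-z} sum_k seq[k] z^k / k!, truncated at len(seq) terms."""
    total, w = 0.0, exp(-z)
    for k, a in enumerate(seq):
        total += a * w
        w *= z / (k + 1)
    return total

n, K = 64, 256
mu = dst_mean(K + 2)
f1 = poisson_sum(mu[:K + 1], n)
# Second derivative of a Poisson generating function: second differences of a_k.
f1_second = poisson_sum([mu[k + 2] - 2 * mu[k + 1] + mu[k] for k in range(K + 1)], n)

order0 = f1                          # terms j = 0, 1 (tau_1 vanishes)
order2 = f1 - n / 2 * f1_second      # adds the j = 2 term, tau_2(n) = -n
print(mu[n], order0, order2)
print(abs(mu[n] - order0), abs(mu[n] - order2))
```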

The polynomial growth of condition (I) is sufficient for all our uses; see [36] for more general versions. 
The real advantage of introducing admissibility is that it opens the possibility of developing closure 
properties as we now discuss. 

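Lemma 2.3. Let $m$ be a nonnegative integer and $\alpha \in (0, 1)$.

(i) $z^m, e^{-\alpha z} \in \mathscr{JS}$.

(ii) If $\tilde{f} \in \mathscr{JS}$, then $\tilde{f}(\alpha z), z^m \tilde{f} \in \mathscr{JS}$.

(iii) If $\tilde{f}, \tilde{g} \in \mathscr{JS}$, then $\tilde{f} + \tilde{g} \in \mathscr{JS}$.

(iv) If $\tilde{f} \in \mathscr{JS}$, then the product $\tilde{P} \tilde{f} \in \mathscr{JS}$, where $\tilde{P}$ is a polynomial of $z$.

(v) If $\tilde{f}, \tilde{g} \in \mathscr{JS}$, then $\tilde{h} \in \mathscr{JS}$, where $\tilde{h}(z) := \tilde{f}(\alpha z) \tilde{g}((1 - \alpha) z)$.

(vi) If $\tilde{f} \in \mathscr{JS}$, then $\tilde{f}' \in \mathscr{JS}$, and thus $\tilde{f}^{(m)} \in \mathscr{JS}$.

Proof. Straightforward and omitted. □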

Specific to our need for the analysis of DSTs is the following transfer principle. 
Proposition 2.4. Let f{z) and g(z) be entire functions satisfying 

f(z) + f(z)=2f(z/2)+g(z), (13) 

with /(0) = 0. Then 

gejy if and only if f G jy. 

Proof. Assume g G J?y '. We check first the condition (O) for /. Let f(z) := e z f(z) and g(z) := e z g(z). 
By (13), 

f(z) = 2e z / 2 f(z/2)+g(z). 

Consequently, since f(0) = 0, 



\ Z {2e t ' 2 f{t/2) + g(t)) dt = z f (2e tz ' 2 f(tz/2) + g(tz)) dt. (14) 
Jo Jo 
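Now define
\[ B(r) := \max_{z \in \mathscr{C}_{r,\varepsilon}} |f(z)|, \]
where
\[ \mathscr{C}_{r,\varepsilon} := \{z : |z| \le r,\ \varepsilon \le |\arg(z)| \le \pi\}, \qquad (r \ge 0;\ 0 < \varepsilon < \pi/2). \]
Then, by (14), we have
\begin{align*}
B(r) &\le r\int_0^1 \bigl(2e^{tr\cos(\varepsilon)/2} B(tr/2) + |g(tr)|\bigr)\,dt \\
&= \int_0^r \bigl(2e^{t\cos(\varepsilon)/2} B(t/2) + O(e^{(1-\varepsilon)t})\bigr)\,dt \\
&\le Ce^{r\cos(\varepsilon)/2} B(r/2) + O(e^{(1-\varepsilon)r}),
\end{align*}
where $C = 4/\cos\varepsilon > 1$. This suggests that we define a majorant function $K(r)$ of $B(r)$ by $K(r) = O(1)$
for $r \le 1$ and for $r > 1$
\[ K(r) = Ce^{r\cos(\varepsilon)/2} K(r/2) + h(r), \]
where $h$ is an entire function satisfying $h(r) = O(1)$ for $r \le 1$ and $h(r) = O(e^{(1-\varepsilon)r})$ for $r \ge 1$. Let
$\tilde K(r) := e^{-r\cos(\varepsilon)} K(r)$ and $\tilde h(r) := e^{-r\cos(\varepsilon)} h(r)$. Then since $\cos\varepsilon - 1 + \varepsilon > 0$ for $\varepsilon \in (0, 1)$, we obtain
\[ \tilde K(r) = C\tilde K(r/2) + \tilde h(r), \qquad \tilde h(r) = O(1). \]
Thus if we choose $m = \lceil \log_2 r \rceil$ such that $2^m \ge r$ and iterate $m$ times the functional equation, then we
obtain the estimate
\[ \tilde K(r) = \sum_{0 \le k \le m} C^k \tilde h(r/2^k) + C^{m+1}\tilde K(r/2^{m+1}) = O\Bigl(\sum_{r/2^k \ge 1} C^k + C^{m+1}\Bigr) = O(r^{\log_2 C}). \]
Thus
\[ B(r) = O(r^{\log_2 C} e^{r\cos\varepsilon}), \]
which establishes condition (O).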
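The growth rate produced by the majorant recurrence can be illustrated numerically. The following is a minimal sketch, assuming the simplified model $\tilde K(r) = C\tilde K(r/2) + 1$ with $\tilde K(r) = 1$ for $r \le 1$ (that is, $\tilde h \equiv 1$) and the illustrative choice $\varepsilon = 0.1$; the function name `majorant` is ours and not part of the paper.

```python
import math

def majorant(r, C, h=1.0):
    """Solve K(r) = C*K(r/2) + h for r > 1, with K(r) = h for r <= 1."""
    if r <= 1:
        return h
    return C * majorant(r / 2, C, h) + h

# Illustrative constant C = 4/cos(eps) with eps = 0.1 (an assumption of this sketch).
C = 4 / math.cos(0.1)
exponent = math.log2(C)

# K(2^m) / (2^m)^{log2 C} should stay bounded, here by C/(C-1).
ratios = [majorant(2.0 ** m, C) / (2.0 ** m) ** exponent for m in range(1, 30)]
print(min(ratios), max(ratios), C / (C - 1))
```

In this model $\tilde K(2^m) = C^m + (C^m - 1)/(C - 1)$, so the ratio increases to $C/(C-1)$, in agreement with the bound $O(r^{\log_2 C})$.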

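Our proof for $\tilde f$ satisfying (I) proceeds in a similar manner and starts again from (14) but of the form
\[ \tilde f(z) = z\int_0^1 e^{-(1-t)z}\bigl(2\tilde f(tz/2) + \tilde g(tz)\bigr)\,dt. \]
Now, define
\[ \tilde B(r) := \max_{z \in \mathscr{S}_{r,\varepsilon}} |\tilde f(z)|, \]
where
\[ \mathscr{S}_{r,\varepsilon} := \{z : |z| \le r,\ |\arg(z)| \le \varepsilon\}, \qquad (r \ge 0;\ 0 < \varepsilon < \pi/2). \]
Then
\begin{align*}
\tilde B(r) &\le r\int_0^1 e^{-(1-t)r\cos\varepsilon}\bigl(2\tilde B(tr/2) + |\tilde g(tr)|\bigr)\,dt \\
&= \int_1^r \Bigl(2e^{-(r-t)\cos\varepsilon}\tilde B(t/2) + O\bigl(e^{-(r-t)\cos\varepsilon} t^{\alpha}(\log_+ t)^{\beta}\bigr)\Bigr)\,dt + O(1) \\
&\le C\tilde B(r/2) + O\bigl(r^{\alpha}(\log_+ r)^{\beta} + 1\bigr),
\end{align*}
where $C = 2/\cos\varepsilon > 2$. The same majorization argument used above for (O) then leads to
\[ \tilde B(r) = \begin{cases} O(r^{\log_2 C}), & \text{if } \alpha < \log_2 C; \\ O(r^{\log_2 C}(\log_+ r)^{\beta+1}), & \text{if } \alpha = \log_2 C; \\ O(r^{\alpha}(\log_+ r)^{\beta}), & \text{if } \alpha > \log_2 C. \end{cases} \]
This proves (I) for $\tilde f$.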

The necessity part follows trivially from Lemma 2.3. □ 

The estimates we derived of asymptotic-transfer type are indeed over-pessimistic when $1 < \alpha < \log_2 C$, but they are sufficient for our use. The true orders are those with $\varepsilon \to 0$, which can be proved by
the Laplace-Mellin-de-Poissonization approach we use later.

Lemma 2.3 and Proposition 2.4 provide very effective tools for justifying the de-Poissonization of 
functions satisfying the equation (13), which is often carried out through the use of the increasing-domain 
argument (see [36]). The latter argument is also inductive in nature and similar to the one we are developing 
here, although it is less "mechanical" and less systematic. 



2.4 Generating functions and integral transforms 

Since our approach is purely analytic and relies heavily on generating functions, we first derive in this 
subsection the differential-functional equations we will be working on later. Then we apply the de- 
Poissonization tools we developed to the Poisson generating functions of the mean and the second moment 
and justify the asymptotic nature of the corresponding Poisson-Charlier expansions. Then we sketch the 
asymptotic tools we will follow based on the Laplace and Mellin transforms. 



Generating functions. In terms of the moment generating function $M_n(y) := E(e^{X_n y})$, the recurrence
(6) translates into
\[ M_{n+1}(y) = e^{ny}\,2^{-n}\sum_{0 \le j \le n}\binom{n}{j} M_j(y) M_{n-j}(y), \qquad (n \ge 0), \tag{15} \]
with $M_0(y) = 1$.

Now consider the bivariate exponential generating function
\[ F(z, y) := \sum_{n \ge 0}\frac{M_n(y)}{n!} z^n. \]
Then by (15),
\[ \frac{\partial}{\partial z} F(z, y) = F\Bigl(\frac{e^y z}{2}, y\Bigr)^2, \]
and the Poisson generating function $\tilde F(z, y) := e^{-z} F(z, y)$ satisfies the differential-functional equation
\[ \tilde F(z, y) + \frac{\partial}{\partial z}\tilde F(z, y) = e^{(e^y - 1)z}\,\tilde F\Bigl(\frac{e^y z}{2}, y\Bigr)^2, \tag{16} \]
with $\tilde F(0, y) = 1$. No exact solution of such a nonlinear differential equation is available; see [35] for an
asymptotic approximation to $\tilde F$ for $y$ near unity.

Mean and second moment. Let now
\[ \tilde F(z, y) = \sum_{m \ge 0}\frac{\tilde f_m(z)}{m!} y^m, \]
where $\tilde f_m(z)$ denotes the Poisson generating function of $E(X_n^m)$. Then we deduce from (16) that
\begin{align}
\tilde f_1(z) + \tilde f_1'(z) &= 2\tilde f_1(z/2) + z, \tag{17} \\
\tilde f_2(z) + \tilde f_2'(z) &= 2\tilde f_2(z/2) + 2\tilde f_1(z/2)^2 + 4z\tilde f_1(z/2) + 2z\tilde f_1'(z/2) + z + z^2, \tag{18}
\end{align}
with the initial conditions $\tilde f_1(0) = \tilde f_2(0) = 0$.
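The first two moments can be computed exactly for small $n$, which gives a direct check of (17) at the level of coefficients. The sketch below assumes the distributional recurrence $X_{n+1} = X_{B_n} + X^*_{n-B_n} + n$ implied by (15), with $B_n$ binomial$(n, 1/2)$; the function names are ours and purely illustrative.

```python
from fractions import Fraction
from math import comb

def dst_moments(n_max):
    """Exact E(X_n) and E(X_n^2), 0 <= n <= n_max, from X_{n+1} = X_{B_n} + X*_{n-B_n} + n."""
    mean = [Fraction(0)] * (n_max + 1)
    second = [Fraction(0)] * (n_max + 1)
    for n in range(n_max):
        m = Fraction(0)
        s = Fraction(0)
        for j in range(n + 1):
            p = Fraction(comb(n, j), 2 ** n)  # P(B_n = j)
            a, b = mean[j], mean[n - j]
            m += p * (a + b + n)
            # E[(A + B + n)^2] for independent subtree costs A and B
            s += p * (second[j] + second[n - j] + 2 * a * b + 2 * n * (a + b) + n * n)
        mean[n + 1] = m
        second[n + 1] = s
    return mean, second

def poisson_coefficients(a):
    """n![z^n] of e^{-z} * sum_k a_k z^k / k! (binomial inverse of the sequence a)."""
    return [sum((-1) ** (n - k) * comb(n, k) * a[k] for k in range(n + 1))
            for n in range(len(a))]

def satisfies_equation_17(mean):
    """Coefficient form of (17): t_n + t_{n+1} = 2^{1-n} t_n + [n = 1]."""
    t = poisson_coefficients(mean)
    return all(t[n] + t[n + 1] == Fraction(2, 2 ** n) * t[n] + (1 if n == 1 else 0)
               for n in range(len(t) - 1))

mean, second = dst_moments(12)
print(mean[3], second[3] - mean[3] ** 2)
```

For three keys the path-length is 2 or 3 with equal probability, which matches the mean and variance printed by the last line.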

Proposition 2.5. The Poisson-Charlier expansion for the mean and that for the second moment are both
asymptotic expansions
\begin{align*}
E(X_n) &= \sum_{0 \le j < 2k}\frac{\tilde f_1^{(j)}(n)}{j!}\,\tau_j(n) + O(n^{-k+1}), \\
E(X_n^2) &= \sum_{0 \le j < 2k}\frac{\tilde f_2^{(j)}(n)}{j!}\,\tau_j(n) + O(n^{-k+2}(\log n)^2),
\end{align*}
for $k = 1, 2, \dots$.

Proof. (Sketch) By Lemma 2.3 and Proposition 2.4, we see that both $\tilde f_1, \tilde f_2 \in \mathscr{JS}$, and thus we can apply
Proposition 2.2. Indeed the proof of Proposition 2.4 provides already crude bounds for the growth order
of $\tilde f_1, \tilde f_2$. The more precise estimates $\tilde f_1(z) \asymp |z||\log z|$ and $\tilde f_2(z) \asymp |z|^2|\log z|^2$ for $z$ inside the sector
$\{z : |\arg(z)| \le \varepsilon\}$ will be provided later in the next two subsections. □



An asymptotic approach based on Laplace and Mellin transforms. Once the de-Poissonization steps
are justified, all that remains for the proof of Theorem 2.1 is to derive more precise asymptotic approximations
to $\tilde f_1$ and $\tilde V$ (as defined in (4)). The approach we use begins with a more precise characterization
of $\tilde f_1(z)$. Both $\tilde f_1$ and $\tilde V$ satisfy a differential-functional equation of the form
\[ \tilde f(z) + \tilde f'(z) = 2\tilde f(z/2) + \tilde g(z), \]
with the initial condition $\tilde f(0) = 0$. To derive the asymptotics of $\tilde f$ for large complex $z$, we proceed along
the following principal steps; see also [10].

Laplace transform: The Laplace transform of $\tilde f$ satisfies
\[ (s+1)\mathscr{L}[\tilde f; s] = 4\mathscr{L}[\tilde f; 2s] + \mathscr{L}[\tilde g; s], \tag{19} \]
which exists and defines an analytic function if $\tilde g$ grows at most polynomially for large $|z|$.

Normalizing factor: Dividing both sides of (19) by $Q(-2s) = \prod_{j \ge 0}(1 + s/2^j)$ gives a functional equation
of the form
\[ \bar{\mathscr{L}}[\tilde f; s] = 4\bar{\mathscr{L}}[\tilde f; 2s] + \frac{\mathscr{L}[\tilde g; s]}{Q(-2s)}, \]
where $\bar{\mathscr{L}}[\tilde f; s] := \mathscr{L}[\tilde f; s]/Q(-s)$.

Mellin transform: The Mellin transform of $\bar{\mathscr{L}}$ then satisfies
\[ \mathscr{M}[\bar{\mathscr{L}}[\tilde f; s]; \omega] = \frac{1}{1 - 2^{2-\omega}}\,\mathscr{M}\Bigl[\frac{\mathscr{L}[\tilde g; s]}{Q(-2s)}; \omega\Bigr]. \]

Inverting the process. We first derive the local behavior of $\mathscr{L}[\tilde f; s]$ for small $s$ by the Mellin inversion
(often by calculus of residues after justification of analytic properties), and then the asymptotic
behavior of $\tilde f(z)$ for large $z$ is derived by the Laplace inversion, similar to singularity analysis.
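The factor 4 in (19) and the effect of the normalizing factor can be checked in a few lines. The following worked computation is a sketch that assumes only the standard rules for the Laplace transform, the condition $\tilde f(0) = 0$, and the product form of $Q(-2s)$ quoted above.

```latex
\begin{align*}
\mathscr{L}[\tilde f'; s] &= s\,\mathscr{L}[\tilde f; s] - \tilde f(0) = s\,\mathscr{L}[\tilde f; s], \\
\mathscr{L}[\tilde f(\cdot/2); s] &= \int_0^\infty e^{-sz}\tilde f(z/2)\,dz
   = 2\int_0^\infty e^{-2sw}\tilde f(w)\,dw = 2\,\mathscr{L}[\tilde f; 2s], \\
\frac{s+1}{Q(-2s)} &= \frac{s+1}{\prod_{j\ge 0}(1+s/2^j)}
   = \frac{1}{\prod_{j\ge 1}(1+s/2^j)} = \frac{1}{Q(-s)}, \\
\mathscr{M}[h(2s); \omega] &= \int_0^\infty s^{\omega-1}h(2s)\,ds = 2^{-\omega}\,\mathscr{M}[h(s); \omega].
\end{align*}
```

The first two lines give (19); the third shows that dividing by $Q(-2s)$ removes the factor $s+1$; the last explains the denominator $1 - 4\cdot 2^{-\omega} = 1 - 2^{2-\omega}$ in the Mellin step.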

2.5 Expected internal path-length of random DSTs 

We consider in detail in this subsection the expected value $\mu_n := E(X_n)$ of the total internal path-length,
paving the way for the asymptotic analysis of the variance. Starting from either the equation (17) or the
recurrence
\[ \mu_{n+1} = n + 2^{1-n}\sum_{0 \le j \le n}\binom{n}{j}\mu_j \qquad (n \ge 0), \]
with $\mu_0 := 0$, there are several approaches to the asymptotics of $\mu_n$. We will briefly describe the one using
integral representation of finite differences (or Rice's integrals) and then present the Laplace and Mellin
transforms we will use, which, as will become clear, is essentially the Flajolet-Richmond approach (see
[24]).

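Rice's integral representation. By (17), we have, with $\tilde\mu_n := n![z^n]\tilde f_1(z)$,
\[ \tilde\mu_{n+1} = -(1 - 2^{1-n})\,\tilde\mu_n + \delta_{n,1} \qquad (n \ge 0), \]
with $\tilde\mu_0 = 0$, where $\delta_{n,1}$ (equal to 1 if $n = 1$ and 0 otherwise) accounts for the term $z$ in (17). By iteration this yields
\[ \tilde\mu_n = (-1)^n Q_{n-2} \quad (n \ge 2), \qquad Q_n := \prod_{1 \le j \le n}\bigl(1 - 2^{-j}\bigr). \tag{20} \]
Thus by Rice's formula ([27])
\begin{align*}
\mu_n := E(X_n) &= \sum_{2 \le j \le n}\binom{n}{j}\tilde\mu_j \\
&= \frac{1}{2\pi i}\int_{(3/2)}\frac{\Gamma(n+1)\Gamma(-s)}{\Gamma(n+1-s)}\,\frac{Q(1)}{(1 - 2^{1-s})\,Q(2^{1-s})}\,ds,
\end{align*}
where the integration path $\int_{(c)}$ is along the vertical line with real part equal to $c$ and $Q$ is defined in (9).
We then obtain (7) by standard arguments; see [26] or [50] for details.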

This approach readily gives the approximation (7) for the mean and can be refined to obtain a full 
asymptotic expansion. However, its extension to the variance becomes extremely messy, as shown in [42]. 
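The alternating sum underlying Rice's formula can be compared with the recurrence for $\mu_n$ in exact arithmetic. This is a small sketch, assuming the recurrence $\mu_{n+1} = n + 2^{1-n}\sum_j \binom{n}{j}\mu_j$ with $\mu_0 = 0$ and the closed form $\mu_n = \sum_{2 \le k \le n}\binom{n}{k}(-1)^k Q_{k-2}$ that follows from (20); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def mean_by_recurrence(n_max):
    """mu_0, ..., mu_{n_max} from mu_{n+1} = n + 2^{1-n} * sum_j C(n,j) mu_j."""
    mu = [Fraction(0)] * (n_max + 1)
    for n in range(n_max):
        total = sum((comb(n, j) * mu[j] for j in range(n + 1)), Fraction(0))
        mu[n + 1] = n + Fraction(2, 2 ** n) * total
    return mu

def q_finite(n):
    """Q_n = prod_{1 <= i <= n} (1 - 2^{-i}) as an exact fraction."""
    prod = Fraction(1)
    for i in range(1, n + 1):
        prod *= 1 - Fraction(1, 2 ** i)
    return prod

def mean_closed_form(n):
    """mu_n = sum_{2 <= k <= n} C(n,k) (-1)^k Q_{k-2}."""
    return sum(((-1) ** k * comb(n, k) * q_finite(k - 2) for k in range(2, n + 1)),
               Fraction(0))

mu = mean_by_recurrence(25)
print(mu[4], mean_closed_form(4))
```

Both expressions give $35/8$ for $n = 4$; the alternating signs in the closed form are what make a direct numerical evaluation unstable for large $n$ and motivate the integral representation.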
Laplace transform. We first show that the asymptotics of $\tilde f_1(z)$ can be derived through a direct use of
the Laplace and Mellin transforms, which relies on several ad hoc steps that are not easily extended. A
more general procedure will be developed below.

By (17), we see that the Laplace transform of $\tilde f_1(z)$ satisfies the functional equation
\[ (s+1)\mathscr{L}[\tilde f_1; s] = 4\mathscr{L}[\tilde f_1; 2s] + s^{-2}, \tag{21} \]
which exists and is analytic in $\mathbb{C} \setminus (-\infty, 0]$.

By dividing both sides by $s + 1$ and by iteration, we get
\[ \mathscr{L}[\tilde f_1; s] = \frac{1}{s^2}\sum_{j \ge 0}\frac{1}{(s+1)\cdots(2^j s+1)}. \tag{22} \]
On the other hand, from (20), we have
\[ \mathscr{L}[\tilde f_1; s] = \int_0^\infty e^{-sz}\sum_{n \ge 0}\frac{(-1)^n Q_n}{(n+2)!}\,z^{n+2}\,dz = \sum_{n \ge 0}(-1)^n Q_n s^{-n-3}. \]
This implies the identity
\[ \sum_{n \ge 0}\frac{(-1)^n Q_n}{s^{n+1}} = \sum_{j \ge 0}\frac{1}{(s+1)\cdots(2^j s+1)}. \]
However, neither form is useful for our asymptotic purpose.
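Now by partial fraction expansion, we obtain
\[ \frac{1}{(s+1)\cdots(2^j s+1)} = \sum_{0 \le \ell \le j}\frac{(-1)^{j-\ell}\,2^{-\binom{j-\ell+1}{2}}}{Q_\ell\,Q_{j-\ell}\,(2^\ell s+1)}. \]
Thus, interchanging the order of summation,
\[ \mathscr{L}[\tilde f_1; s] = \frac{1}{s^2}\sum_{\ell \ge 0}\frac{1}{Q_\ell\,(2^\ell s+1)}\sum_{j \ge 0}\frac{(-1)^j\,2^{-\binom{j+1}{2}}}{Q_j}. \]
By the Euler identity
\[ \sum_{j \ge 0}\frac{(-1)^j q^{\binom{j}{2}} x^j}{(1-q)\cdots(1-q^j)} = \prod_{j \ge 0}\bigl(1 - q^j x\bigr), \]
we see that
\[ \sum_{j \ge 0}\frac{(-1)^j\,2^{-\binom{j+1}{2}}}{Q_j} = \prod_{j \ge 1}\bigl(1 - 2^{-j}\bigr) = Q_\infty \approx 0.28878809. \]
This gives
\[ \mathscr{L}[\tilde f_1; s] = \frac{Q_\infty}{s^2}\sum_{\ell \ge 0}\frac{1}{Q_\ell\,(2^\ell s+1)}, \]
and then
\[ \tilde f_1(z) = Q_\infty\sum_{\ell \ge 0}\frac{2^\ell}{Q_\ell}\Bigl(e^{-z/2^\ell} - 1 + \frac{z}{2^\ell}\Bigr). \tag{23} \]
Consequently,
\[ \mu_n = Q_\infty\sum_{\ell \ge 0}\frac{2^\ell}{Q_\ell}\Bigl((1 - 2^{-\ell})^n - 1 + \frac{n}{2^\ell}\Bigr). \]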
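The numerical value of $Q_\infty$ and the series for $\mu_n$ are easy to test in floating point. The sketch below assumes the series $\mu_n = Q_\infty\sum_{\ell \ge 0}(2^\ell/Q_\ell)\bigl((1-2^{-\ell})^n - 1 + n/2^\ell\bigr)$ for $n \ge 1$ and compares it with the recurrence $\mu_{n+1} = n + 2^{1-n}\sum_j\binom{n}{j}\mu_j$; the truncation levels and function names are our own choices.

```python
import math

def q_partial(n):
    """Q_n = prod_{1 <= i <= n} (1 - 2^{-i}) in floating point."""
    prod = 1.0
    for i in range(1, n + 1):
        prod *= 1.0 - 2.0 ** (-i)
    return prod

Q_INF = q_partial(200)  # numerically indistinguishable from the infinite product

def mean_from_series(n, terms=80):
    """mu_n from the series above, valid for n >= 1, truncated after `terms` terms."""
    total = 0.0
    q_l = 1.0
    for l in range(terms):
        if l == 0:
            bracket = n - 1.0  # (1 - 1)^n - 1 + n
        else:
            q_l *= 1.0 - 2.0 ** (-l)
            x = 2.0 ** (-l)
            # expm1/log1p limit the cancellation in (1 - x)^n - 1 for tiny x
            bracket = math.expm1(n * math.log1p(-x)) + n * x
        total += (2.0 ** l / q_l) * bracket
    return Q_INF * total

def mean_from_recurrence(n_max):
    """Floating-point version of mu_{n+1} = n + 2^{1-n} * sum_j C(n,j) mu_j."""
    mu = [0.0] * (n_max + 1)
    for n in range(n_max):
        mu[n + 1] = n + 2.0 ** (1 - n) * sum(math.comb(n, j) * mu[j] for j in range(n + 1))
    return mu

print(Q_INF, mean_from_series(10), mean_from_recurrence(10)[10])
```

Unlike the alternating binomial sum, every term of this series is nonnegative, which is why it remains numerically stable as $n$ grows.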

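Asymptotically, we have, by (23) and the identity
\[ \sum_{\ell \ge 0}\frac{2^{(s+1)\ell}}{Q_\ell} = \frac{1}{(1 - 2^{s+1})\,Q(2^{s+1})} \qquad (\Re(s) < -1), \tag{24} \]
the Mellin integral representation
\[ \tilde f_1(z) = \frac{1}{2\pi i}\int_{(-3/2)}\frac{Q(1)\,\Gamma(s)\,z^{-s}}{(1 - 2^{s+1})\,Q(2^{s+1})}\,ds, \]
from which we derive the asymptotic approximation
\[ \tilde f_1(z) = (z+1)\log_2 z + z\Bigl(\frac{\gamma - 1}{\log 2} + \frac{1}{2} - c_1 + \varpi_1(\log_2 z)\Bigr) + O(1), \tag{25} \]
uniformly for $|z| \to \infty$ and $|\arg(z)| \le \pi/2 - \varepsilon$, where $\varpi_1$ is given in (8). (As usual, we use the asymptotic
estimate (12) for the Gamma function.)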

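Laplace and Mellin transforms. We now re-do the analysis for $\tilde f_1(z)$ in a more general way that can be
easily extended to other cases.

We again start from (21) and consider
\[ \bar{\mathscr{L}}[\tilde f_1; s] := \frac{\mathscr{L}[\tilde f_1; s]}{Q(-s)}, \]
where $Q(z)$ is defined in (9). Dividing both sides of (21) by $Q(-2s)$ yields
\[ \bar{\mathscr{L}}[\tilde f_1; s] = 4\bar{\mathscr{L}}[\tilde f_1; 2s] + \frac{1}{s^2 Q(-2s)}. \tag{26} \]
We now apply the Mellin transform. Note that we have, by the fact that $X_0 = X_1 = 0$ and the proof of
Proposition 2.4,
\[ \tilde f_1(z) = \begin{cases} O(z^2), & \text{if } z \to 0^+; \\ O(z^{1+\varepsilon}), & \text{if } z \to \infty. \end{cases} \]
Then
\[ \mathscr{L}[\tilde f_1; s] = \begin{cases} O(s^{-2-\varepsilon}), & \text{as } s \to 0^+; \\ O(s^{-3}), & \text{as } s \to \infty. \end{cases} \]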
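On the other hand, by the Mellin transform,
\begin{align}
\log Q(-2s) &= \sum_{j \ge 0}\log\Bigl(1 + \frac{s}{2^j}\Bigr) \nonumber \\
&= \frac{1}{2\pi i}\int_{(-1/2)}\frac{\pi s^{-w}}{(1 - 2^w)\,w\sin\pi w}\,dw \nonumber \\
&= \frac{(\log s)^2}{2\log 2} + \frac{\log s}{2} + q_0 + \sum_{k \in \mathbb{Z}\setminus\{0\}} q_k s^{-\chi_k} + O(|s|^{-1}), \tag{27}
\end{align}
uniformly for $|s| \to \infty$ and $|\arg(s)| \le \pi - \varepsilon$, where $\chi_k := 2k\pi i/\log 2$,
\[ q_0 = \frac{\log 2}{12} + \frac{\pi^2}{6\log 2}, \]
and
\[ q_k = -\frac{1}{2k\sinh(2k\pi^2/\log 2)} \qquad (k \ne 0). \]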
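The expansion (27) can be checked numerically on the positive real axis. The sketch below assumes the product form $Q(-2s) = \prod_{j \ge 0}(1 + s/2^j)$ and drops the oscillating terms $q_k s^{-\chi_k}$, whose coefficients are smaller than $10^{-12}$ in absolute value; the function names and the truncation at 400 factors are illustrative choices.

```python
import math

LOG2 = math.log(2.0)

def log_q_minus_2s(s, terms=400):
    """log Q(-2s) = sum_{j >= 0} log(1 + s / 2^j), truncated after `terms` factors."""
    return sum(math.log1p(s / 2.0 ** j) for j in range(terms))

def expansion_27(s):
    """Leading terms of (27): (log s)^2 / (2 log 2) + (log s) / 2 + q_0."""
    q0 = LOG2 / 12.0 + math.pi ** 2 / (6.0 * LOG2)
    log_s = math.log(s)
    return log_s ** 2 / (2.0 * LOG2) + log_s / 2.0 + q0

for s in (1e3, 1e4, 1e6):
    print(s, log_q_minus_2s(s) - expansion_27(s))
```

The printed differences shrink in proportion to $1/s$, consistent with the error term $O(|s|^{-1})$ in (27).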
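This asymptotic expansion, together with the Taylor expansion
\[ Q(-2s) = 1 + O(|s|), \qquad (|s| \to 0), \]
gives rise to
\[ \bar{\mathscr{L}}[\tilde f_1; s] = \begin{cases} O(s^{-2-\varepsilon}), & \text{as } s \to 0^+; \\ O(s^{-M}), & \text{as } s \to \infty, \end{cases} \]
where $M > 0$ is an arbitrary real number. Consequently, the Mellin transform of $\bar{\mathscr{L}}[\tilde f_1; s]$, denoted by
$\mathscr{M}[\bar{\mathscr{L}}; \omega]$, exists in the half-plane $\Re(\omega) > 2 + \varepsilon$. Then by applying the Mellin transform to (26), we obtain
\[ \mathscr{M}[\bar{\mathscr{L}}[\tilde f_1; s]; \omega] = \frac{G_1(\omega)}{1 - 2^{2-\omega}}, \qquad (\Re(\omega) > 2), \]
where
\[ G_1(\omega) := \int_0^\infty\frac{s^{\omega-3}}{Q(-2s)}\,ds = \frac{\pi\,Q(2^{\omega-2})}{Q(1)\sin\pi\omega} = \frac{Q(2^{\omega-2})}{Q(1)}\,\Gamma(\omega)\Gamma(1-\omega), \tag{28} \]
for $\Re(\omega) > 2$; see [24].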



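Inverse Mellin and inverse Laplace transforms. We can now apply successively the inverse Mellin
and then Laplace transforms to derive the asymptotics of $\tilde f_1(z)$. Observe that $G_1(\omega)$ has a simple pole at
$\omega = 2$. By (28) or Proposition 5 in [22], we obtain
\[ |G_1(c + it)| = O\bigl(e^{-(\pi - \varepsilon)|t|}\bigr), \]
for large $|t|$ and $c \in \mathbb{R}$. Then by the calculus of residues,
\[ \bar{\mathscr{L}}[\tilde f_1; s] = \frac{1}{s^2}\Bigl(\log_2\frac{1}{s} + \frac{1}{2} - c_1 + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} G_1(2 + \chi_k)\,s^{-\chi_k}\Bigr) + O(|s|^{-1}), \]
uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$. Using the expansion
\[ Q(-s) = 1 + s + O(|s|^2) \qquad (|s| \to 0), \]
we see that
\[ \mathscr{L}[\tilde f_1; s] = \frac{1}{s^2}\Bigl(\log_2\frac{1}{s} + \frac{1}{2} - c_1 + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} G_1(2 + \chi_k)\,s^{-\chi_k}\Bigr) + \frac{1}{s}\log_2\frac{1}{s} + O(|s|^{-1}), \]
uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$.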

Finally, we consider the inverse Laplace transform. The following simple result is very useful for our 
purposes. 

Proposition 2.6. Let $\tilde f(z)$ be a function whose Laplace transform exists and is analytic in $\mathbb{C} \setminus (-\infty, 0]$.
Assume that

{0(\s\- a \\og\s + l\\ m ) , 
cs -"(-log S ) m , (29) 
o(|s|- Q |log|s + l|| m ), 

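uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$, where $\alpha \in \mathbb{R}$, $c \in \mathbb{C}$ and $m = 0, 1, \dots$. If $\mathscr{L}[\tilde f; s]$ satisfies
\[ |\mathscr{L}[\tilde f; s]| = O\bigl(|s|^{-1-\varepsilon}\bigr), \tag{30} \]
as $|s| \to \infty$ in $|\arg(s)| \le \pi - \varepsilon$, then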



'Od^l^-^loglzl)" 1 ) 



/(*) = 



1 £ ( m Wr-^- 



CZ 

0<j<m 

[o(|z| Q - 1 (log|z|) m ) 



respectively, where the $O$- and $o$-terms hold uniformly for $|z| \to \infty$ and $|\arg(z)| \le \pi/2 - \varepsilon$.
Proof. Let $\mathscr{L}(s) := \mathscr{L}[\tilde f; s]$. Then by the inverse Laplace transform,

f(z) = 7 L f e zs J?(s) ds = -L /" e"j^( a ) ds , 
2tu 2vn J w 

where $\mathcal{H}$ is the Hankel contour consisting of the two rays $te^{\pm i\varepsilon} \pm i/|z|$, $-\infty < t \le 0$, and the semicircle
$\exp(i\varphi)/|z|$, $-\pi/2 \le \varphi \le \pi/2$; see Figure 6.

Assume from now on that $|z|$ is sufficiently large and lies in the sector with $|\arg(z)| \le \pi/2 - \varepsilon$. We prove
only the $O$-case, the other two cases being similar. For simplicity, we consider only the case $m = 0$, the
other cases being easily extended.

We split the above integral along $\mathcal{H}$ into two parts
\[ \frac{1}{2\pi i}\int_{\mathcal{H}} e^{zs}\mathscr{L}(s)\,ds = \frac{1}{2\pi i}\int_{\mathcal{H}_>} e^{zs}\mathscr{L}(s)\,ds + \frac{1}{2\pi i}\int_{\mathcal{H}_\le} e^{zs}\mathscr{L}(s)\,ds, \]
where $\mathcal{H}_>$ comprises the two rays $te^{\pm i\varepsilon} \pm i/|z|$, $-\infty < t \le -T$, with $T > 1$ a fixed constant, and $\mathcal{H}_\le$
represents the remaining contour.

Figure 6: The contour $\mathcal{H}$.



The integral along H > is easily estimated 



1 

2vri 



e zs ^(s) ds = 



O 



3 K(|z|e iar sM(te ie +i/|z|))u|-l 



dt 



-oo 
oo 



e -\z\tcos{aig{z)+e) ^ 



O (H £ e- C I 2 I T ) 



the $O$-term holding uniformly for $|z| \to \infty$ provided that $|\arg(z)| + \varepsilon < \pi/2$, where $c > 0$ is a suitable
constant.

For the second integral, we use (29). Then the integral along the semicircle is bounded as follows. 



2ttUI 



tt/2 



tt/2 



- zei9 ^ +i6 ^(e ie /\z\) d6 = (Izl - 1 ) 



uniformly for $|z| \to \infty$. For the remaining part $t \pm i/|z|$, $-T \le t \le 0$, we have



2vri 



/ rO 

e z(t±i/\z\)c£(t±i/\z\)dt = 0\ \Z 



-T 



7 c\z\t 



_ T (\z\H 2 + l) a / 2 



dt 



O \z 



ia-1 



oo c~ CU 



{u 2 + i) a i' 



■ du 



= 0(\z\ a - 1 ), 

uniformly for $|z| \to \infty$, where $c > 0$ is a suitable constant. This completes the proof.



□ 



Note that the inverse Laplace transform of $s^{-2}\log(1/s)$ is $z\log z - (1 - \gamma)z$. This, together with a
combined use of Proposition 2.6, leads to (25).
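The inverse Laplace transform quoted here follows from differentiating the Gamma integral with respect to its exponent. The short derivation below is a sketch using only the standard facts $\Gamma(2) = 1$ and $\Gamma'(2) = 1 - \gamma$; the auxiliary parameter $a$ is introduced for the computation.

```latex
\begin{align*}
\int_0^\infty e^{-sz} z^{a-1}\,dz &= \Gamma(a)\,s^{-a}, \\
\int_0^\infty e^{-sz} z^{a-1}\log z\,dz &= \frac{\partial}{\partial a}\bigl(\Gamma(a)\,s^{-a}\bigr)
   = \bigl(\Gamma'(a) - \Gamma(a)\log s\bigr)s^{-a}, \\
\int_0^\infty e^{-sz} z\log z\,dz &= \frac{1 - \gamma - \log s}{s^2} \qquad (a = 2), \\
\int_0^\infty e^{-sz}\bigl(z\log z - (1-\gamma)z\bigr)\,dz &= \frac{1}{s^2}\log\frac{1}{s}.
\end{align*}
```

Dividing by $\log 2$ shows that the term $s^{-2}\log_2(1/s)$ contributes $z\log_2 z + z(\gamma - 1)/\log 2$, which is the origin of the constant $(\gamma - 1)/\log 2$ in (25).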

The justification of the estimate (30) is easily performed by using the relation (31) below. 

The Flajolet-Richmond approach [24]. Instead of the Poisson generating function, this approach starts
from the ordinary generating function $A(z) := \sum_n \mu_n z^n$.

- The Euler transform 2
\[ \hat A(s) := \frac{1}{s+1}\,A\Bigl(\frac{1}{s+1}\Bigr) \]
satisfies
\[ (s+1)\hat A(s) = 4\hat A(2s) + s^{-2}, \]
identical to (21).
- The normalized function $\bar A(s) := \hat A(s)/Q(-s)$ satisfies
\[ \bar A(s) = 4\bar A(2s) + \frac{1}{s^2 Q(-2s)}, \]
again identical to (26).
- The Mellin transform of $\bar A$ satisfies ($\Re(\omega) > 2$)
\[ \mathscr{M}[\bar A; \omega] = \frac{G_1(\omega)}{1 - 2^{2-\omega}}, \]
where $G_1(\omega)$ is as defined in (28).

2 For a better comparison with the approach we use, our $\hat A$ differs from the usual Euler transform by a factor of $s$.

Then invert the process by considering first the Mellin inversion, deriving asymptotics of
\[ \bar A(s) = \frac{1}{2\pi i}\int_{(5/2)}\frac{G_1(\omega)}{1 - 2^{2-\omega}}\,s^{-\omega}\,d\omega, \]
as $s \to 0$ in $\mathbb{C}$. Then deduce asymptotics of
\[ A(z) = \frac{1}{z}\,\hat A\Bigl(\frac{1}{z} - 1\Bigr) \]
as $z \to 1$. Finally, apply singularity analysis (see [23]) to conclude the asymptotics of $\mu_n$.



Figure 7: A diagrammatic comparison of the major steps used in the Laplace-Mellin (left-half) approach
and the Flajolet-Richmond (right-half) approach. Here EGF denotes "exponential generating function",
OGF stands for "ordinary generating function" and de-Poi is the abbreviation for de-Poissonization.

The crucial reason why the two approaches are identical at certain steps is that the Laplace transform of
a Poisson generating function is essentially equal to the Euler transform of an ordinary generating function;
or formally, with $A(z) = \sum_{n \ge 0} a_n z^n$,
\[ \int_0^\infty e^{-sz}\,e^{-z}\sum_{n \ge 0}\frac{a_n}{n!}z^n\,dz = \sum_{n \ge 0}\frac{a_n}{(s+1)^{n+1}} = \frac{1}{s+1}\,A\Bigl(\frac{1}{s+1}\Bigr). \tag{31} \]
Thus the simple result in Proposition 2.6 closely parallels that in singularity analysis. While identical at
certain steps, the two approaches diverge in their final treatment of the coefficients, and the distinction here
is typically that between the saddle-point method and the singularity analysis, a situation reminiscent of
the use before and after Lagrange's inversion formula; see for instance [28].

The relation (31) implies that the order estimate (30) for the Laplace transform at infinity can be
easily justified for all the generating functions we consider in this paper since $A(0) = 0$, implying that
$A(z) = O(|z|)$ as $|z| \to 0$.

This comparison also suggests the possibility of developing de-Poissonization tools by singularity 
analysis, which will be investigated in details elsewhere. 

2.6 Variance of the internal path-length 

In this section, we apply the Laplace-Mellin-de-Poissonization approach to the Poissonized variance with
correction
\[ \tilde V(z) := \tilde f_2(z) - \tilde f_1(z)^2 - z\tilde f_1'(z)^2, \]
aiming at proving Theorem 2.1. Focusing on $\tilde V$ instead of on $\tilde f_2$ from the start removes all heavy
cancellations involved when dealing with the variance, a key step differing from all previous approaches.

Laplace and Mellin transform. The following lemma will be useful.

Lemma 2.7. If
\begin{align*}
\tilde f_1(z) + \tilde f_1'(z) &= 2\tilde f_1(z/2) + \tilde h_1(z), \\
\tilde f_2(z) + \tilde f_2'(z) &= 2\tilde f_2(z/2) + \tilde h_2(z),
\end{align*}
where all functions are entire with $\tilde f_1(0) = \tilde f_2(0) = 0$, then the function $\tilde V(z) := \tilde f_2(z) - \tilde f_1(z)^2 - z\tilde f_1'(z)^2$
satisfies
\[ \tilde V(z) + \tilde V'(z) = 2\tilde V(z/2) + \tilde g(z), \]
with $\tilde V(0) = 0$, where
\[ \tilde g(z) = z\tilde f_1''(z)^2 + \tilde h_2(z) - \tilde h_1(z)^2 - z\tilde h_1'(z)^2 - 4\tilde h_1(z)\tilde f_1(z/2) - 2z\tilde h_1'(z)\tilde f_1'(z/2) - 2\tilde f_1(z/2)^2. \]
Proof. Straightforward and omitted. □
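For the reader's convenience, the omitted computation can be sketched as follows. We write $\tilde f_1'(z/2)$ for the derivative of $\tilde f_1$ evaluated at $z/2$ and use only the two equations assumed in Lemma 2.7 together with the derivative of the first one.

```latex
\begin{align*}
\tilde V + \tilde V'
 &= (\tilde f_2 + \tilde f_2') - (\tilde f_1 + \tilde f_1')^2
    - z(\tilde f_1' + \tilde f_1'')^2 + z\tilde f_1''^2 \\
 &= 2\tilde f_2(z/2) + \tilde h_2
    - \bigl(2\tilde f_1(z/2) + \tilde h_1\bigr)^2
    - z\bigl(\tilde f_1'(z/2) + \tilde h_1'\bigr)^2 + z\tilde f_1''^2, \\
2\tilde V(z/2)
 &= 2\tilde f_2(z/2) - 2\tilde f_1(z/2)^2 - z\tilde f_1'(z/2)^2 .
\end{align*}
```

Subtracting the last line from the second and expanding the squares gives the stated $\tilde g(z)$. With $\tilde h_1(z) = z$ and $\tilde h_2(z)$ read off from (18), all terms other than $z\tilde f_1''(z)^2$ cancel.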
By using the differential-functional equations (17) and (18) for $\tilde f_1(z)$ and $\tilde f_2(z)$, we see, by Lemma 2.7,
that
\[ \tilde V(z) + \tilde V'(z) = 2\tilde V(z/2) + z\tilde f_1''(z)^2, \tag{32} \]
with $\tilde V(0) = 0$.

Before applying the integral transforms, we need rough estimates of $\tilde V(z)$ near $z = 0$ and $z = \infty$. We
have
\[ \tilde V(z) = \begin{cases} O(z^2), & \text{as } z \to 0^+; \\ O(z^{1+\varepsilon}), & \text{as } z \to \infty. \end{cases} \tag{33} \]
These estimates follow from
\[ \tilde f_1''(z) = \begin{cases} O(1), & \text{as } |z| \to 0; \\ O(|z|^{-1}), & \text{as } |z| \to \infty, \end{cases} \tag{34} \]
which in turn result from $X_0 = X_1 = 0$ and (25) (by the proof of condition (I) of Proposition 2.4). Indeed,
the proof there shows that the same bounds hold uniformly for $z \in \mathbb{C}$ with $|\arg(z)| \le \pi/2 - \varepsilon$.

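We now apply the Laplace transform to both sides of (32). First, observe that the Laplace transform of
$\tilde V(z)$ exists and is analytic in $\mathbb{C} \setminus (-\infty, 0]$. Then, by (32),
\[ (s+1)\mathscr{L}[\tilde V; s] = 4\mathscr{L}[\tilde V; 2s] + \tilde g^*(s), \]
where $\tilde g^*(s) := \mathscr{L}[z\tilde f_1''^2; s]$. Next the normalized Laplace transform
\[ \bar{\mathscr{L}}[\tilde V; s] := \frac{\mathscr{L}[\tilde V; s]}{Q(-s)} \]
satisfies
\[ \bar{\mathscr{L}}[\tilde V; s] = 4\bar{\mathscr{L}}[\tilde V; 2s] + \frac{\tilde g^*(s)}{Q(-2s)}. \]
By (33), we obtain
\[ \mathscr{L}[\tilde V; s] = \begin{cases} O(s^{-2-\varepsilon}), & \text{as } s \to 0^+; \\ O(s^{-3}), & \text{as } s \to \infty. \end{cases} \]
From this and the asymptotic expansion (27) of $Q(-2s)$, it follows that the Mellin transform of $\bar{\mathscr{L}}[\tilde V; s]$
exists in the half-plane $\Re(\omega) > 2 + \varepsilon$. Consequently,
\[ \mathscr{M}[\bar{\mathscr{L}}[\tilde V; s]; \omega] = \frac{G_2(\omega)}{1 - 2^{2-\omega}}, \qquad (\Re(\omega) > 2), \]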



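where
\[ G_2(\omega) := \mathscr{M}\Bigl[\frac{\tilde g^*(s)}{Q(-2s)}; \omega\Bigr] = \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)}\int_0^\infty e^{-zs}\,z\tilde f_1''(z)^2\,dz\,ds. \tag{35} \]
By (23), we have
\[ \tilde f_1''(z)^2 = Q_\infty^2\sum_{h, \ell \ge 0}\frac{e^{-z/2^h - z/2^\ell}}{Q_h Q_\ell\,2^{h+\ell}}. \]
Substituting this and the partial fraction expansion
\[ \frac{1}{Q(-2s)} = \frac{1}{Q_\infty}\sum_{j \ge 0}\frac{(-1)^j\,2^{-\binom{j+1}{2}}}{Q_j\,(1 + s/2^j)} \]
into (35), we obtain (10).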
Inverse Mellin and inverse Laplace transforms. For the Mellin inversion, we need more precise analytic
properties of $G_2(\omega)$. By (34), we deduce that the Laplace transform $\tilde g^*(s)$ of $z\tilde f_1''(z)^2$ satisfies
\[ \tilde g^*(s) = \begin{cases} O(|\log s|), & \text{as } |s| \to 0; \\ O(|s|^{-2}), & \text{as } |s| \to \infty, \end{cases} \]
uniformly in the cone $|\arg(s)| \le \pi - \varepsilon$. Thus, by the asymptotic expansion (27) for $Q(-2s)$ and Proposition 5
in [22], we have
\[ |G_2(c + it)| = O\bigl(e^{-(\pi - \varepsilon)|t|}\bigr), \]
for large $|t|$ and $c > 0$. Also the Mellin transform $G_2$ of $\tilde g^*(s)/Q(-2s)$ exists in the half-plane $\Re(\omega) > 0$.
Consequently, by standard calculus of residues,
\[ \bar{\mathscr{L}}[\tilde V; s] = \frac{1}{\log 2}\sum_{k \in \mathbb{Z}} G_2(2 + \chi_k)\,s^{-2-\chi_k} + O(|s|^{-\varepsilon}), \]
uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$. This in turn yields the following expansion for $\mathscr{L}[\tilde V; s]$
\[ \mathscr{L}[\tilde V; s] = \frac{1}{\log 2}\sum_{k \in \mathbb{Z}} G_2(2 + \chi_k)\,s^{-2-\chi_k} + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}} G_2(2 + \chi_k)\,s^{-1-\chi_k} + O(|s|^{-\varepsilon}), \]
again uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$.
Finally, standard Laplace inversion gives
\[ \tilde V(z) = \frac{z}{\log 2}\sum_{k \in \mathbb{Z}}\frac{G_2(2 + \chi_k)}{\Gamma(2 + \chi_k)}\,z^{\chi_k} + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}}\frac{G_2(2 + \chi_k)}{\Gamma(1 + \chi_k)}\,z^{\chi_k} + O(|z|^{-1+\varepsilon}), \tag{36} \]
uniformly for $|z| \to \infty$ and $|\arg(z)| \le \pi/2 - \varepsilon$.

Since $\tilde f_2(z) = \tilde V(z) + \tilde f_1(z)^2 + z\tilde f_1'(z)^2$, we see from (36) and (25) that
\[ \tilde f_2(z) \asymp \tilde f_1(z)^2 \asymp |z|^2\log^2|z| \qquad (|\arg(z)| \le \pi/2 - \varepsilon). \]
This proves Proposition 2.5 and Theorem 2.1 by straightforward expansion. More refined calculations give
\[ V(X_n) = \tilde V(n) - \frac{n}{2}\tilde V''(n) - \frac{n^2}{2}\tilde f_1''(n)^2 + O(n^{-1}), \]
the two terms following $\tilde V(n)$ being both $O(1)$ and periodic in nature. It is possible to further extend
the same idea and derive a full asymptotic expansion, which has also its identity nature; details will be
presented in a future paper.



3 Bucket Digital Search Trees 

In this section, we extend the same approach to bucket digital search trees ($b$-DSTs) in which each node
can hold up to $b$ keys. The construction rule is the same as for DSTs, except that keys stay in a node
as long as its capacity remains less than $b$; see Figure 8 for a simple example with $b = 2$. DSTs correspond
to $b = 1$.

Note that when $b \ge 2$ we can distinguish two different types of total path-length: the total path-length
of all keys (summing the distance between each key and the root over all keys), which will be referred to as
the total key-wise path-length (KPL), and the total path-length of all nodes (summing the distance between
each node and the root over all nodes, regardless of the number of keys in each node), referred to as the total
node-wise path-length (NPL). When $b = 1$ the two total path-lengths coincide. For simplicity, we will use
KPL and NPL, dropping the collective adjective "total". While the expected values of both TPLs are of
order $n\log n$ under the same independent Bernoulli model, their variances surprisingly turn out to exhibit
very different behavior; see Table 1.



[Figure 8 (tree diagram not reproduced): the nine keys shown are 010111, 101011, 100001, 011011, 111110, 110111, 010011, 011110, 000100.]

Figure 8: A 2-DST with nine keys. The total key-wise path-length is equal to $4 \times 1 + 3 \times 2 = 10$ and the
total node-wise path-length equals $2 \times 1 + 3 \times 2 = 8$.
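The two path-lengths quoted for Figure 8 can be reproduced by building the tree directly. The sketch below assumes that the nine keys are inserted in the order in which they are listed in the figure and that a full node routes a key to its child according to the next unused bit; the function name and the dictionary-based node layout are ours.

```python
def bucket_dst_path_lengths(keys, b):
    """Insert binary strings into a b-DST; return (key-wise, node-wise) total path-lengths."""
    root = {"keys": [], "children": {}}
    for key in keys:
        node, depth = root, 0
        # Descend while the current node is full, branching on successive bits.
        while len(node["keys"]) >= b:
            node = node["children"].setdefault(key[depth], {"keys": [], "children": {}})
            depth += 1
        node["keys"].append(key)

    kpl = npl = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        kpl += depth * len(node["keys"])  # every key contributes its depth
        npl += depth                      # every node contributes its depth once
        stack.extend((child, depth + 1) for child in node["children"].values())
    return kpl, npl

FIGURE_8_KEYS = ["010111", "101011", "100001", "011011", "111110",
                 "110111", "010011", "011110", "000100"]
print(bucket_dst_path_lengths(FIGURE_8_KEYS, 2))
```

With $b = 2$ the root and its two children each hold two keys and three further nodes at depth 2 hold one key each, which gives the values 10 and 8 stated in the caption.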



3.1 Key-wise path-length (KPL)

We assume the same independent Bernoulli model for the input strings. Let $X_n$ denote the KPL in a
random $b$-DST built from $n$ random strings. Then by definition and the independence model assumption
\[ X_{n+b} \stackrel{d}{=} X_{B_n} + X^*_{n-B_n} + n, \qquad (n \ge 0), \tag{37} \]
with the initial conditions $X_0 = \cdots = X_{b-1} = 0$. Here $B_n \sim \text{Binomial}(n, 1/2)$, $X_n \stackrel{d}{=} X_n^*$, and
$X_n$, $X_n^*$, $B_n$ are independent.

Known and new results. Hubalek [30] showed, by the Flajolet-Richmond approach, that the mean
satisfies
\[ E(X_n) = (n + b)\log_2 n + n\bigl(c_2 + \varpi_3(\log_2 n)\bigr) + c_3 + \varpi_4(\log_2 n) + O\bigl(n^{-1}\log n\bigr), \]
where $c_2$, $c_3$ are effectively computable constants and $\varpi_3$ and $\varpi_4$ are very smooth periodic functions. He
also proved that the variance is asymptotically linear
\[ V(X_n) = n\bigl(C_h + \varpi_h(\log_2 n)\bigr) + O((\log n)^2), \]
where $C_h$ is expressed in terms of a very long, involved expression and $\varpi_h$ is a periodic function.

We improve this estimate by deriving a much simpler expression for the periodic function, including
its average value $C_h$. To state our result, we define the following functions. Let




0<j<b VJ/ V 7 
It is easily seen that $\tilde g(z)$ is of the form

§{*) = 



2<ii,i 2 <6+l 



(39) 



2<ji,i 2 <6 



where gi lt i 2 , g' ilji2 > are given explicitly by 

b\ ( b\ ( b\ (b — i\ 



ii + 1) 



9i\,i 2 



b \/ 6 \ _ / 6 \/6_i 1 + i 
ii-lJU-lJ U-lA i 2 -l 



b — i\ 

%X — \)\%2 — \ 



both coefficients being symmetric in $i_1$ and $i_2$. Define
\[ G_2(\omega) = \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)^b}\int_0^\infty e^{-zs}\,\tilde g(z)\,dz\,ds, \]
which is well-defined for $\Re(\omega) > 0$, as we will see later.

Theorem 3.1. The variance of the total key-wise path-length of random $b$-DSTs of $n$ strings satisfies
\[ V(X_n) = n\bigl(C_h + \varpi_h(\log_2 n)\bigr) + O(1), \tag{40} \]
where
\[ C_h = \frac{G_2(2)}{\log 2} = \frac{1}{\log 2}\int_0^\infty\frac{s}{Q(-2s)^b}\int_0^\infty e^{-zs}\,\tilde g(z)\,dz\,ds, \]
and
\[ \varpi_h(t) = \frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}}\frac{G_2(2 + \chi_k)}{\Gamma(2 + \chi_k)}\,e^{2k\pi it}. \]

By straightforward truncations, expansions and approximations, we obtain the following numerical
values for $b = 1, \dots, 5$.

    b      1        2        3        4        5
    C_h    0.26600  0.13260  0.09004  0.06958  0.05781

More powerful means need to be developed if a higher degree of precision is required.

Generating functions. From (37), it follows that the moment generating function $M_n(y) := E(e^{X_n y})$
can be recursively computed by the relation
\[ M_{n+b}(y) = e^{ny}\,2^{-n}\sum_{0 \le j \le n}\binom{n}{j} M_j(y) M_{n-j}(y) \qquad (n \ge 0), \]
with $M_n(y) = 1$ for $0 \le n < b$. The bivariate exponential generating function $F(z, y)$ then satisfies the
equation
\[ \frac{\partial^b}{\partial z^b} F(z, y) = F\Bigl(\frac{e^y z}{2}, y\Bigr)^2, \]
with $F^{(j)}(0, y) = 1$ for $0 \le j < b$, and we have the nonlinear equation for the Poisson generating function
$\tilde F(z, y) := e^{-z} F(z, y)$
\[ \sum_{0 \le j \le b}\binom{b}{j}\frac{\partial^j}{\partial z^j}\tilde F(z, y) = e^{(e^y - 1)z}\,\tilde F\Bigl(\frac{e^y z}{2}, y\Bigr)^2, \tag{41} \]
with $\tilde F(0, y) = 1$.

From this form, the asymptotic analysis of the mean value and that of the variance proceed along
exactly the same line we developed in the previous section. Thus we briefly sketch the principal steps of
the analysis, leaving the details to the interested reader.



The expected value of $X_n$. From (41), we derive the following differential-functional equation for the
Poisson generating function of the mean
\[ \sum_{0 \le j \le b}\binom{b}{j}\tilde f_1^{(j)}(z) = 2\tilde f_1(z/2) + z, \]
with the initial conditions $\tilde f_1^{(j)}(0) = 0$ for $0 \le j < b$.

Before applying the Laplace-Mellin approach, we need first a transfer-type result similar to Proposition 2.4.

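Proposition 3.2. Let $\tilde f(z)$ and $\tilde g(z)$ be entire functions satisfying
\[ \sum_{0 \le j \le b}\binom{b}{j}\tilde f^{(j)}(z) = 2\tilde f(z/2) + \tilde g(z), \tag{42} \]
with $\tilde f(0) = 0$. Then
\[ \tilde g \in \mathscr{JS} \quad\text{if and only if}\quad \tilde f \in \mathscr{JS}. \]

Proof. (Sketch) The same proof as that for Proposition 2.4 applies mutatis mutandis to (42). The only
difference is that we now have
\[ f^{(b)}(z) = 2e^{z/2} f(z/2) + g(z), \]
where $f(z) := e^z\tilde f(z)$ and $g(z) := e^z\tilde g(z)$, so that (14) has the extended representation
\begin{align*}
f(z) &= \int_0^z\frac{(z-t)^{b-1}}{(b-1)!}\bigl(2e^{t/2} f(t/2) + g(t)\bigr)\,dt \\
&= \frac{z^b}{(b-1)!}\int_0^1 (1-t)^{b-1}\bigl(2e^{tz/2} f(tz/2) + g(tz)\bigr)\,dt,
\end{align*}
and
\[ \tilde f(z) = \frac{z^b}{(b-1)!}\int_0^1 (1-t)^{b-1} e^{-(1-t)z}\bigl(2\tilde f(tz/2) + \tilde g(tz)\bigr)\,dt. \]
All required estimates can be derived by the same arguments used there. □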
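The Laplace transform of $\tilde f_1$ now satisfies the functional equation
\[ (s+1)^b\mathscr{L}[\tilde f_1; s] = 4\mathscr{L}[\tilde f_1; 2s] + s^{-2}, \]
for $\Re(s) > 0$. From this equation, we obtain
\[ \mathscr{L}[\tilde f_1; s] = \frac{1}{s^2}\sum_{j \ge 0}\frac{1}{(s+1)^b\cdots(1 + 2^j s)^b}, \]
which extends (22). From this series and partial fraction expansions, we can derive a closed-form expression
for $\tilde f_1(z)$, which becomes messy especially for large $b$. Define as before $\bar{\mathscr{L}}[\tilde f_1; s] := \mathscr{L}[\tilde f_1; s]/Q(-s)^b$.
Then we obtain
\[ \bar{\mathscr{L}}[\tilde f_1; s] = 4\bar{\mathscr{L}}[\tilde f_1; 2s] + \frac{1}{s^2 Q(-2s)^b}. \]
This relation is almost the same as (26). Thus the same Mellin analysis given there carries over and we
deduce that
\[ \mathscr{L}[\tilde f_1; s] = \frac{1}{s^2}\log_2\frac{1}{s} + \frac{1}{s^2}\Bigl(\frac{1}{2} + \frac{c_4}{\log 2} + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}} G_1(2 + \chi_k)\,s^{-\chi_k}\Bigr) + \frac{b}{s}\log_2\frac{1}{s} + O(|s|^{-1}), \]
uniformly for $|s| \to 0$ and $|\arg(s)| \le \pi - \varepsilon$, where
\[ G_1(\omega) := \int_0^\infty\frac{s^{\omega-3}}{Q(-2s)^b}\,ds, \]
and
\[ c_4 := \lim_{\omega \to 2}\Bigl(G_1(\omega) - \frac{1}{\omega - 2}\Bigr) = \int_0^1\frac{1}{s}\Bigl(\frac{1}{Q(-2s)^b} - 1\Bigr)ds + \int_1^\infty\frac{ds}{s\,Q(-2s)^b}. \]
Consequently, by the Laplace inversion,
\[ \tilde f_1(z) = (z + b)\log_2 z + z\Bigl(\frac{\gamma - 1}{\log 2} + \frac{1}{2} + \frac{c_4}{\log 2} + \frac{1}{\log 2}\sum_{k \in \mathbb{Z}\setminus\{0\}}\frac{G_1(2 + \chi_k)}{\Gamma(2 + \chi_k)}\,z^{\chi_k}\Bigr) + O(1), \]
uniformly for $|z| \to \infty$ and $|\arg(z)| \le \pi/2 - \varepsilon$. From this and Propositions 3.2 and 2.2, we obtain
\[ E(X_n) = \sum_{0 \le j < 2k}\frac{\tilde f_1^{(j)}(n)}{j!}\,\tau_j(n) + O\bigl(n^{-k+1}\bigr), \]
for any $k = 1, 2, \dots$. Finally,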
Variance of $X_n$. The analysis here is again similar to that for the mean. Let $\tilde f_2(z)$ denote the Poisson
generating function of the second moment $E(X_n^2)$. Then, by (41),
\[ \sum_{0 \le j \le b}\binom{b}{j}\tilde f_2^{(j)}(z) = 2\tilde f_2(z/2) + 2\tilde f_1(z/2)^2 + 4z\tilde f_1(z/2) + 2z\tilde f_1'(z/2) + z + z^2, \]
with the first $b$ Taylor coefficients zero. Define again
\[ \tilde V(z) = \tilde f_2(z) - \tilde f_1(z)^2 - z\tilde f_1'(z)^2. \]
Then $\tilde V(z)$ satisfies
\[ \sum_{0 \le j \le b}\binom{b}{j}\tilde V^{(j)}(z) = 2\tilde V(z/2) + \tilde g(z), \]
where $\tilde g(z)$ is given in (38).

By the representations (39) and (43), we have
\[ \tilde g(z) = \begin{cases} O(|z|), & \text{as } |z| \to 0; \\ O(|z|^{-1}), & \text{as } |z| \to \infty, \end{cases} \]
uniformly in the sector $|\arg(z)| \le \pi/2 - \varepsilon$. This is similar to the corresponding estimate (34) in the
analysis of the variance in the previous section. The same procedure there applies and we deduce (40).

3.2 Node-wise path-length (NPL) 

We consider in this section the total node-wise path-length (NPL). Under the same independent Bernoulli
model, we still use $X_n$ to denote the NPL in a random $b$-DST of $n$ binary strings with node capacity $b \ge 2$.
Also let $N_n$ stand for the total number of nodes (space requirement) in a random $b$-DST of $n$ strings. Although
$X_n$ is one of the most natural shape measures for $b$-DSTs, its consideration here seems to be
new. For $N_n$, it is known that the distribution is asymptotically normal with the mean and the variance both
asymptotically $n$ times a different smooth periodic function; see [31]. In contrast to (40) for the variance
of KPL, what is unexpected and surprising here is that the variance of $X_n$ is of order $n(\log n)^2$.

Theorem 3.3. Assume $b \ge 2$. The mean of $N_n$ and that of $X_n$ satisfy the following asymptotic relations.
E(iV n ) =nPi j0 (log 2 n) + O(l), 



E(X n ) = n(log 2 n)Pi )0 (log 2 n) +nP® (log 2 n) + (log 2 n)pg(log 2 n) + 0(1); 

and the variances of N n and X n satisfy 

V(i\y = nP 2 , (log 2 n) + O(l), 
Cov(iV n , X n ) = n(log 2 n)P 2 , (log 2 n) + nPf\i\og 2 n) + (logn)P^(log 2 n) + 0(1), 
VpQ = n(log 2 n) 2 P 2i0 (log 2 n) + n(log 2 n)P c | 2 ](log 2 n) + nP^ (log 2 n) 
+ (logn) 2 P,g(log 2 n) + (log 2 n)P$(log 2 n) +0(1), 

where the P. . 's are all computable, smooth, 1-periodic functions. 



(44) 



(45) 
Intuitively, that the variance of NPL is larger than that of KPL can be seen from the definition of NPL,
which depends on the random variable $N_n$ (see (46)), while on the other hand, KPL depends, apart from
the two subtrees, on $n$ only. The following figure shows the first few values of the variance of NPL
and that of KPL.




We see that the variance of NPL increases faster than that of KPL. 

Note that the periodic functions of the dominant terms are all equal, implying that the correlation 
coefficient of N n and X n is asymptotically 1. 

On the other hand, the mean value $c_{1,0}$ of $P_{1,0}(t)$ is given by
\[ c_{1,0} = \frac{1}{\log 2}\int_0^\infty\frac{(s+1)^{b-1}}{Q(-2s)^b}\,ds; \]
numerical approximations to $c_{1,0}$ for the first few $b$ are given as follows.

    b         1    2        3        4        5        6
    c_{1,0}   1    0.57470  0.40698  0.31594  0.25849  0.21885

Note that when $b = 1$,
\[ c_{1,0} = \frac{1}{\log 2}\int_0^\infty\frac{ds}{Q(-2s)} = 1, \]
by (28), which is consistent with the fact that $N_n = n$ in this case.

When b = 2, we see that about 42.5% of nodes on average contain two keys and 14% of nodes a single 
key. The storage utilization is thus not very bad. 
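The tabulated values of $c_{1,0}$ can be reproduced by straightforward numerical integration. The sketch below assumes the integral representation $c_{1,0} = (\log 2)^{-1}\int_0^\infty (s+1)^{b-1}Q(-2s)^{-b}\,ds$ with $Q(-2s) = \prod_{j \ge 0}(1 + s/2^j)$; the substitution $s = e^u$, the integration range and the step size are our own choices, and the function names are hypothetical.

```python
import math

def log_q_minus_2s(s, terms=200):
    """log Q(-2s) = sum_{j >= 0} log(1 + s / 2^j), truncated after `terms` factors."""
    return sum(math.log1p(s / 2.0 ** j) for j in range(terms))

def c10(b, lo=-45.0, hi=15.0, step=0.02):
    """Approximate (1/log 2) * int_0^inf (1+s)^(b-1) / Q(-2s)^b ds.

    With s = e^u the integrand becomes a smooth, rapidly decaying function
    of u, for which the trapezoidal rule is very accurate.
    """
    n = int(round((hi - lo) / step))
    total = 0.0
    for i in range(n + 1):
        u = lo + i * step
        s = math.exp(u)
        log_integrand = u + (b - 1) * math.log1p(s) - b * log_q_minus_2s(s)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(log_integrand)
    return total * step / math.log(2.0)

for b in (1, 2, 3):
    print(b, round(c10(b), 5))
```

For $b = 1$ the routine returns 1 up to rounding, in line with $N_n = n$, and for $b = 2$ it should agree with the tabulated value 0.57470 to the digits shown.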

From (44) and these numerical values, we see that, in contrast to the expected KPL, which is asymptotic
to $n\log_2 n$ for all $b$, the expected NPL provides a better indication of the "shape variation" of random $b$-DSTs.

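Our analysis is based on the following straightforward distributional recurrences
\begin{align}
N_{n+b} &\stackrel{d}{=} N_{B_n} + N^*_{n-B_n} + 1, \nonumber \\
X_{n+b} &\stackrel{d}{=} X_{B_n} + X^*_{n-B_n} + N_{B_n} + N^*_{n-B_n}, \qquad (n \ge 0), \tag{46}
\end{align}
with the initial conditions $N_0 = 0$, $N_1 = \cdots = N_{b-1} = 1$ and $X_0 = \cdots = X_{b-1} = 0$. Here again
$B_n \sim \text{Binomial}(n, 1/2)$, $N_n \stackrel{d}{=} N_n^*$, $X_n \stackrel{d}{=} X_n^*$ and $X_n$, $X_n^*$, $B_n$ as well as $N_n$, $N_n^*$, $B_n$ are independent.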
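Taking expectations in (46) gives linear recurrences that can be iterated exactly. The sketch below assumes the initial conditions $N_0 = 0$, $N_1 = \cdots = N_{b-1} = 1$ and $X_0 = \cdots = X_{b-1} = 0$, and uses the fact that both subtrees have the same law; the function name is ours.

```python
from fractions import Fraction
from math import comb

def bucket_means(b, n_max):
    """Exact E(N_n) and E(X_n) (node-wise path-length) in a random b-DST, 0 <= n <= n_max."""
    nodes = [Fraction(0)] + [Fraction(1)] * n_max  # entries with index >= b are overwritten
    npl = [Fraction(0)] * (n_max + 1)
    for n in range(n_max + 1 - b):
        s_nodes = sum((comb(n, j) * nodes[j] for j in range(n + 1)), Fraction(0))
        s_npl = sum((comb(n, j) * npl[j] for j in range(n + 1)), Fraction(0))
        scale = Fraction(2, 2 ** n)  # 2 * 2^{-n}: two subtrees, binomial(n, 1/2) split
        nodes[n + b] = 1 + scale * s_nodes
        # every node of either subtree lies one level deeper than in its own subtree
        npl[n + b] = scale * (s_npl + s_nodes)
    return nodes, npl

nodes, npl = bucket_means(2, 40)
print(float(nodes[40] / 40), float(npl[40]))
```

For $b = 1$ the recurrences reduce to $E(N_n) = n$ and to the recurrence for the mean internal path-length of DSTs, which provides a simple consistency check.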
Generating functions. Define $M_n(u, v) = E(e^{N_n u + X_n v})$. Then (46) translates into the recurrence
\[ M_{n+b}(u, v) = e^u\,2^{-n}\sum_{j=0}^{n}\binom{n}{j} M_j(u + v, v)\,M_{n-j}(u + v, v), \qquad (n \ge 0), \]
with $M_0(u, v) = 1$, $M_1(u, v) = \cdots = M_{b-1}(u, v) = e^u$. Next, let
\[ F(z, u, v) := \sum_{n \ge 0}\frac{M_n(u, v)}{n!}\,z^n. \]
Then the recurrence relation gives
\[ \frac{\partial^b}{\partial z^b} F(z, u, v) = e^u F\Bigl(\frac{z}{2}, u + v, v\Bigr)^2, \]
and the Poisson generating function $\tilde F(z, u, v) := e^{-z} F(z, u, v)$ satisfies
\[ \sum_{0 \le j \le b}\binom{b}{j}\frac{\partial^j}{\partial z^j}\tilde F(z, u, v) = e^u\,\tilde F\Bigl(\frac{z}{2}, u + v, v\Bigr)^2, \tag{47} \]
with the initial conditions $\tilde F(z, u, v) = 1 + (e^u - 1)\sum_{1 \le j < b}(-1)^{j-1}z^j/j! + \cdots$.
For the moments, if we expand $\tilde F(z, u, v)$ in terms of $u$ and $v$,
\[ \tilde F(z, u, v) = \sum_{m \ge 0}\frac{1}{m!}\sum_{0 \le j \le m}\binom{m}{j}\tilde f_{j,m-j}(z)\,u^j v^{m-j}, \]
then $\tilde f_{j,m-j}(z)$ is the Poisson generating function of $E(N_n^j X_n^{m-j})$. Thus all moments of $X_n$ and $N_n$ or
their products can be computed by taking suitable derivatives of (47) with respect to $u$ and $v$ and then
substituting $u = v = 0$.

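Expected number of nodes and expected node-wise path length. By taking first derivatives of (47),
we obtain
\begin{align}
\sum_{0 \le j \le b}\binom{b}{j}\tilde f_{1,0}^{(j)}(z) &= 2\tilde f_{1,0}(z/2) + 1, \nonumber \\
\sum_{0 \le j \le b}\binom{b}{j}\tilde f_{0,1}^{(j)}(z) &= 2\tilde f_{0,1}(z/2) + 2\tilde f_{1,0}(z/2), \tag{48}
\end{align}
the initial conditions being $\tilde f_{1,0}(0) = 0$, $\tilde f_{1,0}^{(j)}(0) = (-1)^{j-1}$ for $1 \le j < b$ and $\tilde f_{0,1}^{(j)}(0) = 0$ for $0 \le j < b$.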

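We can apply the Laplace-Mellin approach as before, starting from the mean of $N_n$. Note that
\[ \mathscr{L}[\tilde f^{(j)};s] = s^j\mathscr{L}[\tilde f;s] - \sum_{0\le\ell<j}s^{j-1-\ell}\tilde f^{(\ell)}(0) \qquad (j=0,1,\dots), \]
provided that the Laplace transform exists for $\Re(s)>0$. This gives
\[ (s+1)^b\mathscr{L}[\tilde f_{1,0};s] = 4\mathscr{L}[\tilde f_{1,0};2s] + \tilde g^*_{1,0}(s), \]
where, by the initial conditions of (48),
\[ \tilde g^*_{1,0}(s) := \frac1s + \sum_{1\le j\le b}\binom bj\sum_{1\le\ell<j}(-1)^{\ell-1}s^{j-1-\ell} = \frac{(s+1)^{b-1}}{s}. \]

Unlike all previous cases, iterating this functional equation leads to a divergent series. Although this problem can be solved by subtracting a sufficient number of initial terms of $\tilde f_{1,0}(z)$, the approach we use does not rely on this and avoids completely such a consideration.

Let $\bar{\mathscr{L}}[\tilde f_{1,0};s] := \mathscr{L}[\tilde f_{1,0};s]/Q(-s)^b$. Then
\[ \mathscr{M}[\bar{\mathscr{L}}[\tilde f_{1,0};s];\omega] = \frac{G_{1,0}(\omega)}{1-2^{2-\omega}} \qquad (\Re(\omega)>2), \]
where
\[ G_{1,0}(\omega) := \int_0^\infty \frac{s^{\omega-2}(s+1)^{b-1}}{Q(-2s)^b}\,\mathrm{d}s \]
for $\Re(\omega)>1$.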
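The boundary terms produced by the Laplace transform of the equation for $\tilde f_{1,0}$ collapse to a simple closed form. As a hedged check, the sketch below compares, in exact rational arithmetic, the sum $\frac1s + \sum_{1\le j\le b}\binom bj\sum_{1\le\ell<j}(-1)^{\ell-1}s^{j-1-\ell}$ (obtained from the initial values $\tilde f_{1,0}^{(\ell)}(0) = (-1)^{\ell-1}$) with $(s+1)^{b-1}/s$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def g_star_sum(b, s):
    """1/s plus the boundary terms of the Laplace transform of
    sum_j C(b,j) f^{(j)}(z), with f^{(l)}(0) = (-1)^(l-1) for 1 <= l < b."""
    total = Fraction(1) / s
    for j in range(1, b + 1):
        for l in range(1, j):
            total += comb(b, j) * (-1) ** (l - 1) * s ** (j - 1 - l)
    return total

def g_star_closed(b, s):
    """Conjectured closed form (s+1)^(b-1) / s."""
    return (s + 1) ** (b - 1) / s

for b in range(1, 6):
    s = Fraction(3, 7)
    print(b, g_star_sum(b, s), g_star_closed(b, s))
```

For $b = 1$ both sides reduce to $1/s$, the Laplace transform of the constant term $1$ in the equation.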

From this, we deduce that
\[ \tilde f_{1,0}(z) = zP_{1,0}(\log_2 z) + O(1), \tag{49} \]
uniformly for $|z|\to\infty$ and $|\arg(z)|\le\pi/2-\varepsilon$, where $P_{1,0}(t)$ is a periodic function with the Fourier series representation
\[ P_{1,0}(t) := \frac1{\log2}\sum_{k\in\mathbb{Z}}\frac{G_{1,0}(2+\chi_k)}{\Gamma(2+\chi_k)}e^{2k\pi it}, \]
the series being absolutely convergent. From this we deduce the first approximation of (44).
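We now turn to the expected NPL $\mathbb{E}(X_n)$. By (48), we have
\[ (s+1)^b\mathscr{L}[\tilde f_{0,1};s] = 4\mathscr{L}[\tilde f_{0,1};2s]+4\mathscr{L}[\tilde f_{1,0};2s]. \]
Let $\bar{\mathscr{L}}[\tilde f_{0,1};s] := \mathscr{L}[\tilde f_{0,1};s]/Q(-s)^b$. Then
\[ \mathscr{M}[\bar{\mathscr{L}}[\tilde f_{0,1};s];\omega] = \frac{2^{2-\omega}G_{1,0}(\omega)}{(1-2^{2-\omega})^2} \qquad (\Re(\omega)>2). \]

From this we deduce that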

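\[ \tilde f_{0,1}(z) = z(\log_2z)P^{[1]}_{0,1}(\log_2z)+zP^{[2]}_{0,1}(\log_2z)+(\log_2z)P^{[3]}_{0,1}(\log_2z)+O(1), \tag{50} \]
uniformly for $|z|\to\infty$ and $|\arg(z)|\le\pi/2-\varepsilon$, where $P^{[1]}_{0,1}(t)$, $P^{[2]}_{0,1}(t)$, $P^{[3]}_{0,1}(t)$ are smooth, 1-periodic functions whose Fourier expansions are given by
\[
\begin{aligned}
P^{[1]}_{0,1}(t) &= \frac{1}{\log 2}\sum_{k\in\mathbb{Z}}\frac{G_{1,0}(2+\chi_k)}{\Gamma(2+\chi_k)}e^{2k\pi it},\\
P^{[2]}_{0,1}(t) &= -\frac{1}{(\log 2)^2}\sum_{k\in\mathbb{Z}}\frac{G_{1,0}(2+\chi_k)\psi(2+\chi_k)-G_{1,0}'(2+\chi_k)}{\Gamma(2+\chi_k)}e^{2k\pi it},\\
P^{[3]}_{0,1}(t) &= \frac{b}{\log 2}\sum_{k\in\mathbb{Z}}\frac{G_{1,0}(2+\chi_k)}{\Gamma(1+\chi_k)}e^{2k\pi it}.
\end{aligned}
\]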
Here $\psi(z)$ denotes the derivative of $\log\Gamma(z)$ and all series are absolutely convergent. This proves (44).



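Variance. Taking second derivatives in (47) and substituting $u = v = 0$ gives
\[
\begin{cases}
\displaystyle\sum_{0\le j\le b}\binom bj \tilde f_{2,0}^{(j)}(z) = 2\tilde f_{2,0}(z/2)+2\tilde f_{1,0}(z/2)^2+4\tilde f_{1,0}(z/2)+1,\\[2ex]
\displaystyle\sum_{0\le j\le b}\binom bj \tilde f_{1,1}^{(j)}(z) = 2\tilde f_{1,1}(z/2)+2\tilde f_{2,0}(z/2)+2\bigl(\tilde f_{1,0}(z/2)+\tilde f_{0,1}(z/2)\bigr)\bigl(\tilde f_{1,0}(z/2)+1\bigr),\\[2ex]
\displaystyle\sum_{0\le j\le b}\binom bj \tilde f_{0,2}^{(j)}(z) = 2\tilde f_{0,2}(z/2)+4\tilde f_{1,1}(z/2)+2\tilde f_{2,0}(z/2)+2\bigl(\tilde f_{1,0}(z/2)+\tilde f_{0,1}(z/2)\bigr)^2,
\end{cases}
\]
with the initial conditions $\tilde f_{2,0}^{(j)}(0) = (-1)^{j-1}$ for $1\le j<b$ and $\tilde f_{2,0}(0) = \tilde f_{1,1}^{(j)}(0) = \tilde f_{0,2}^{(j)}(0) = 0$, for $0\le j<b$.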

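The remaining calculations follow the same pattern of proof we used above but become much more involved. We begin with
\[
\begin{aligned}
V(z) &:= \tilde f_{2,0}(z)-\tilde f_{1,0}(z)^2-z\tilde f_{1,0}'(z)^2,\\
U(z) &:= \tilde f_{1,1}(z)-\tilde f_{1,0}(z)\tilde f_{0,1}(z)-z\tilde f_{1,0}'(z)\tilde f_{0,1}'(z),\\
W(z) &:= \tilde f_{0,2}(z)-\tilde f_{0,1}(z)^2-z\tilde f_{0,1}'(z)^2.
\end{aligned}
\]
Then we deduce
\[
\begin{cases}
\displaystyle\sum_{0\le j\le b}\binom bj V^{(j)}(z) = 2V(z/2)+\tilde g_{2,0}(z),\\[2ex]
\displaystyle\sum_{0\le j\le b}\binom bj U^{(j)}(z) = 2U(z/2)+\tilde g_{1,1}(z),\\[2ex]
\displaystyle\sum_{0\le j\le b}\binom bj W^{(j)}(z) = 2W(z/2)+\tilde g_{0,2}(z),
\end{cases}
\]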
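where
\[
\begin{aligned}
\tilde g_{2,0}(z) &:= \Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{1,0}^{(j)}(z)\Bigr)^2 + z\Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{1,0}^{(j+1)}(z)\Bigr)^2 - \sum_{0\le j\le b}\binom bj\Bigl(\tilde f_{1,0}(z)^2+z\tilde f_{1,0}'(z)^2\Bigr)^{(j)},\\
\tilde g_{1,1}(z) &:= 2V(z/2) + \Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{1,0}^{(j)}(z)\Bigr)\Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{0,1}^{(j)}(z)\Bigr) + z\Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{1,0}^{(j+1)}(z)\Bigr)\Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{0,1}^{(j+1)}(z)\Bigr)\\
&\qquad - \sum_{0\le j\le b}\binom bj\Bigl(\tilde f_{1,0}(z)\tilde f_{0,1}(z)+z\tilde f_{1,0}'(z)\tilde f_{0,1}'(z)\Bigr)^{(j)},\\
\tilde g_{0,2}(z) &:= 4U(z/2)+2V(z/2) + \Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{0,1}^{(j)}(z)\Bigr)^2 + z\Bigl(\sum_{0\le j\le b}\binom bj\tilde f_{0,1}^{(j+1)}(z)\Bigr)^2 - \sum_{0\le j\le b}\binom bj\Bigl(\tilde f_{0,1}(z)^2+z\tilde f_{0,1}'(z)^2\Bigr)^{(j)}.
\end{aligned}
\]
The initial conditions are $V(0) = U^{(j)}(0) = W^{(j)}(0) = 0$ for $0\le j<b$ and
\[ V^{(j)}(0) = (-1)^j\bigl(1+(j-2)2^{j-1}\bigr) \qquad (1\le j<b). \]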
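The initial values of $V$ can be double-checked from a local expansion. The sketch assumes, as the initial conditions imply, that $\tilde f_{2,0}$ and $\tilde f_{1,0}$ both agree with $1 - e^{-z}$ up to terms of order $z^b$, so that $V(z)$ agrees with $e^{-z} - (1+z)e^{-2z}$ to the same order; the helper names are illustrative.

```python
from fractions import Fraction
from math import factorial

def taylor_V(order):
    """Taylor coefficients at 0 of e^{-z} - (1+z) e^{-2z}."""
    e1 = [Fraction((-1) ** n, factorial(n)) for n in range(order + 1)]
    e2 = [Fraction((-2) ** n, factorial(n)) for n in range(order + 1)]
    return [e1[n] - e2[n] - (e2[n - 1] if n >= 1 else 0)
            for n in range(order + 1)]

def V_derivative_closed(j):
    """Closed form (-1)^j (1 + (j-2) 2^(j-1)) for the j-th derivative at 0."""
    return (-1) ** j * (1 + (j - 2) * Fraction(2) ** (j - 1))

coeffs = taylor_V(10)
derivs = [factorial(j) * coeffs[j] for j in range(11)]
print([str(d) for d in derivs[:6]])
```

The value at $j = 0$ also reproduces $V(0) = 0$.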
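From (49), (50) and Ritt's theorem (see [54]), we have
\[
\begin{aligned}
\tilde g_{2,0}(z) &= O(|z|^{-1}),\\
\tilde g_{1,1}(z)-2V(z/2) &= O(|z|^{-1}),\\
\tilde g_{0,2}(z)-4U(z/2)-2V(z/2) &= O(|z|^{-1}),
\end{aligned}
\]
uniformly for $|z|\to\infty$ and $|\arg(z)|\le\pi/2-\varepsilon$. Let $\bar{\mathscr{L}}[A;s] := \mathscr{L}[A;s]/Q(-s)^b$, where $A\in\{V,U,W\}$.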

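Then we obtain, for $\Re(\omega)>2$,
\[
\begin{aligned}
\mathscr{M}[\bar{\mathscr{L}}[V;s];\omega] &= \frac{G_{2,0}(\omega)}{1-2^{2-\omega}},\\
\mathscr{M}[\bar{\mathscr{L}}[U;s];\omega] &= \frac{2^{2-\omega}G_{2,0}(\omega)}{(1-2^{2-\omega})^2}+\frac{G_{1,1}(\omega)}{1-2^{2-\omega}},\\
\mathscr{M}[\bar{\mathscr{L}}[W;s];\omega] &= \frac{2^{5-2\omega}G_{2,0}(\omega)}{(1-2^{2-\omega})^3}+\frac{2^{2-\omega}\bigl(2G_{1,1}(\omega)+G_{2,0}(\omega)\bigr)}{(1-2^{2-\omega})^2}+\frac{G_{0,2}(\omega)}{1-2^{2-\omega}},
\end{aligned}
\]
where
\[
\begin{aligned}
G_{2,0}(\omega) &:= \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)^b}\left(\mathscr{L}[\tilde g_{2,0};s]+\frac{(s+1)^{b-1}-(-1)^b\bigl(2b-3+(b-1)s\bigr)}{(s+2)^2}\right)\mathrm{d}s,\\
G_{1,1}(\omega) &:= \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)^b}\int_0^\infty e^{-sz}\bigl(\tilde g_{1,1}(z)-2V(z/2)\bigr)\,\mathrm{d}z\,\mathrm{d}s,\\
G_{0,2}(\omega) &:= \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)^b}\int_0^\infty e^{-sz}\bigl(\tilde g_{0,2}(z)-2V(z/2)-4U(z/2)\bigr)\,\mathrm{d}z\,\mathrm{d}s,
\end{aligned}
\]
with all functions analytic for $\Re(\omega)>0$. Consequently, we deduce (45).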



4 Digital search trees. II. More shape parameters. 

We consider in this section four additional examples on DSTs whose variances are essentially linear. The same tools we use readily apply to $b$-DSTs, but we focus on DSTs because the results are easier to state and the asymptotic behaviors do not differ in essence from those for the more general $b$-DSTs, the corresponding expressions of which are however much messier.

The first parameter we consider is the so-called $w$-parameter (see [16]), which is the sum of the subtree-size of the parent-node of each leaf (over all leaves)³. Instead of $w$-parameter, we call it the total peripheral path-length (PPL), since it measures to some extent the fringe ampleness of the trees. This is also consistent with the two previous notions of path-length we distinguished.

Then we consider the number of leaves, which has previously been studied in detail in [26, 31, 39] and which is well connected to PPL. Our expression for the variance simplifies known ones.

Yet another notion of path-length we consider here is the so-called Colless index in phylogenetics, 
which is the sum of the absolute difference of the two subtree-sizes of each node (over all nodes). We call 
this index the total differential path-length (DPL) as it clearly indicates the balance or symmetry of the 
tree. Another widely used measure of imbalance in phylogenetics is the Sackin index, which is nothing but 
the external path-length. 

The last example we consider is the weighted path-length (WPL), which often arises in coding, opti- 
mization and many related problems. 

The orders of the means and the variances exhibited by all the shape parameters we study in this paper 
are listed in Table 1 . 

4.1 Peripheral path-length (PPL) 

The PPL (or w-parameter) was introduced in [16], the motivations arising from the analysis of compression 
algorithms. We start from the fringe-size of a leaf node A, which is defined to be the size of the subtree 
rooted at its parent-node; see Figure 9. The PPL of a tree is then defined to be the sum of the fringe-sizes 
of all leaf-nodes. Let X n denote the PPL in a DST built from n random binary strings under our usual 
independent Bernoulli model. 



Figure 9: The two possible configurations of the fringe of a leaf: the fringe-size (or w-parameter) equals 
\T\ + 2. Note that T may be empty. 

Drmota et al. showed in [16] that
\[ \mathbb{E}(X_n) = n\bigl(C_w+\varpi_w(\log_2 n)\bigr)+o(n), \tag{51} \]

³ The leaves or leaf-nodes of a tree are nodes without any descendants.
where



^>0 ^ \fc>l / to f>0 



2^-1 



Note that by (24), we have the identities 



(j + 2) 1 1 / V 1 \ 

^ goo ^(2i-l) 2 ^2i + lJ 

^ 2£ - 1 1 /y> 2 



2 



The asymptotic behavior (51) is to be compared with the $n\log n$ order exhibited by most other log-trees such as binary search trees and recursive trees; see [16]. It reflects that most fringes of random DSTs are small in size; see Figure 3. Indeed, since the expected number of leaves is also asymptotic to $n$ times a periodic function, the result (51) implies that the average size of a fringe in random DSTs is bounded. We show that the standard deviation is also small.

Define
\[
\begin{aligned}
\tilde g_2(z) := {}& z\tilde f_1''(z)^2-\frac{z}{16}e^{-z}\bigl(z^4+4z^3+16z^2-8z+64\bigr)\\
&-\frac z4e^{-z/2}\Bigl(4(z+4)\tilde f_1(z/2)-2(z^2+2z-8)\tilde f_1'(z/2)-(z+2)(z+8)\Bigr),
\end{aligned} \tag{52}
\]
where $\tilde f_1(z)$ represents as usual the Poisson generating function of $\mathbb{E}(X_n)$. Let $G_2(\omega)$ denote the Mellin transform of $\mathscr{L}[\tilde g_2;s]/Q(-2s)$.

Theorem 4.1. The mean and the variance of the total PPL $X_n$ of random DSTs of $n$ strings satisfy
\[ \mathbb{E}(X_n) = n\bigl(C_w+\varpi_w(\log_2n)\bigr)+O(1), \]
\[ \mathbb{V}(X_n) = nP_w(\log_2n)+O(1), \tag{53} \]
where $P_w(t)$ is a smooth, 1-periodic function with the Fourier series expansion
\[ P_w(t) = \frac1{\log2}\sum_{k\in\mathbb{Z}}\frac{G_2(2+\chi_k)}{\Gamma(2+\chi_k)}e^{2k\pi it}, \]
the series being absolutely convergent.

We provide only the major steps of the proof since it follows the same approach we developed above.

Recurrence and generating functions. By definition and by conditioning on the size of one of the subtrees of the root, we distinguish different configurations according to whether one of the subtrees of the root is empty, consists of a single leaf, or neither, from which we derive the recurrence for the PPL
\[
X_n \stackrel{d}{=}
\begin{cases}
X_{n-1}, & \text{with probability } 2^{2-n};\\
n+X_{n-2}, & \text{with probability } (n-1)2^{2-n};\\
X_k+X^*_{n-1-k}, & \text{with probability } 2^{1-n}\binom{n-1}{k},\quad 2\le k\le n-3,
\end{cases}
\]
where $X_0 = X_1 = 0$, $X_2 = 2$ and $X_3$ has the distribution
\[
X_3 =
\begin{cases}
6, & \text{with probability } 1/2;\\
2, & \text{with probability } 1/2.
\end{cases}
\]

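From this recurrence, it follows that the bivariate Poisson generating function
\[ \tilde F(z,y) := e^{-z}\sum_{n\ge0}\mathbb{E}(e^{X_ny})\frac{z^n}{n!} \]
satisfies the nonlinear equation
\[
\begin{aligned}
\tilde F(z,y)+\frac{\partial}{\partial z}\tilde F(z,y) = {}& \tilde F\Bigl(\frac z2,y\Bigr)^2 + ze^{2y+(e^y-2)z/2}\tilde F\Bigl(\frac{e^yz}{2},y\Bigr)\\
&- ze^{-z/2}\tilde F\Bigl(\frac z2,y\Bigr)+\frac{z^2}{4}e^{-z}\bigl(e^{3y}-1\bigr)^2,
\end{aligned} \tag{54}
\]
with the initial condition $\tilde F(0,y) = 1$.

The expected PPL. By (54), we obtain the differential-functional equation for $\tilde f_1(z)$ by taking derivative with respect to $y$ and then substituting $y = 0$, giving
\[ \tilde f_1(z)+\tilde f_1'(z) = 2\tilde f_1(z/2)+z(2+z/2)e^{-z/2}, \tag{55} \]
with $\tilde f_1(0) = 0$. The Laplace transform of $\tilde f_1$ satisfies
\[
\mathscr{L}[\tilde f_1;s] = \frac{4}{s+1}\mathscr{L}[\tilde f_1;2s]+\frac{16}{(1+2s)^3}
= 16\sum_{k\ge0}\frac{4^k}{(s+1)\cdots(2^{k-1}s+1)(2^{k+1}s+1)^3}.
\]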
Then a straightforward application of the Laplace-Mellin-de-Poissonization approach yields
\[ \mathbb{E}(X_n) = \frac{n}{\log2}\sum_{k\in\mathbb{Z}}\frac{G_1(2+\chi_k)}{\Gamma(2+\chi_k)}n^{\chi_k}+O(1), \]
where
\[ G_1(\omega) := 16\int_0^\infty\frac{s^{\omega-1}}{Q(-s)(2s+1)^3}\,\mathrm{d}s \qquad (\Re(\omega)>0). \]
The $O(1)$-term can be further refined by the same analysis. In particular, we get an alternative expression for $C_w$
\[ C_w = \frac{G_1(2)}{\log2} = \frac{16}{\log2}\int_0^\infty\frac{s}{Q(-s)(2s+1)^3}\,\mathrm{d}s \approx 1.10302\,66959\cdots. \]
That the two expressions of $C_w$ are identical can be proved by standard calculus of residues; see [24] for similar details.
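Both descriptions of the mean can be compared numerically. The sketch below iterates the mean recurrence $\mu_{m+1} = 2^{1-m}\sum_k\binom mk\mu_k + m(m+1)2^{1-m}$ (obtained by taking expectations in the PPL recurrence, with $\mu_0 = \mu_1 = 0$) and evaluates the integral for $C_w$ by the trapezoidal rule after the substitution $s = e^t$. It assumes $Q(-s) = \prod_{j\ge1}(1+s/2^j)$; function names and grid parameters are illustrative.

```python
import math

def ppl_means(n_max):
    """E(X_n) for the peripheral path-length, n = 0..n_max (floating point)."""
    mu = [0.0] * (n_max + 1)
    for m in range(1, n_max):
        split = sum(math.comb(m, k) * mu[k] for k in range(m + 1)) / 2 ** (m - 1)
        mu[m + 1] = split + m * (m + 1) / 2 ** (m - 1)
    return mu

def log_Q_minus_s(s, terms=200):
    """log Q(-s), assuming Q(-s) = prod_{j>=1} (1 + s/2^j) (truncated)."""
    return sum(math.log1p(s / 2.0 ** j) for j in range(1, terms + 1))

def c_w_constant(t_min=-40.0, t_max=45.0, h=0.05):
    """(16/log 2) * int_0^oo s / (Q(-s) (2s+1)^3) ds via s = e^t."""
    n = int(round((t_max - t_min) / h))
    total = 0.0
    for i in range(n + 1):
        t = t_min + i * h
        s = math.exp(t)
        log_f = 2.0 * t - log_Q_minus_s(s) - 3.0 * math.log1p(2.0 * s)
        total += (0.5 if i in (0, n) else 1.0) * math.exp(log_f)
    return 16.0 * total * h / math.log(2)

mu = ppl_means(300)
cw = c_w_constant()
print(cw, mu[300] / 300)
```

The ratio $\mathbb{E}(X_n)/n$ approaches the constant only at rate $O(1/n)$, so the two printed numbers agree to a couple of digits rather than to full precision.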
The variance of the PPL. Again from (54), we derive the equation for the Poisson generating function $\tilde f_2(z)$ of the second moment of $X_n$
\[
\begin{aligned}
\tilde f_2(z)+\tilde f_2'(z) = {}& 2\tilde f_2(z/2)+2\tilde f_1(z/2)^2+\frac92z^2e^{-z}\\
&+ze^{-z/2}\Bigl((z+4)\tilde f_1(z/2)+z\tilde f_1'(z/2)+\frac{z^2+10z+16}{4}\Bigr),
\end{aligned} \tag{56}
\]
with $\tilde f_2(0) = 0$.

Let $V(z) := \tilde f_2(z)-\tilde f_1(z)^2-z\tilde f_1'(z)^2$. Then, by (55), (56) and Lemma 2.7,
\[ V(z)+V'(z) = 2V(z/2)+\tilde g_2(z), \]
with $V(0) = 0$, where $\tilde g_2$ is defined in (52).

Applying again the Laplace-Mellin-de-Poissonization approach, we deduce (53). In particular, the mean value of the periodic function $P_w$ is given by
\[ \frac{G_2(2)}{\log2} = \frac{1}{\log2}\int_0^\infty\frac{s}{Q(-2s)}\int_0^\infty e^{-sz}\tilde g_2(z)\,\mathrm{d}z\,\mathrm{d}s. \]

4.2 The number of leaves 



The leaves of a tree are the locations where the nodes holding new-coming keys will be connected; thus different types of data fields can be used to save memory, notably for $b$-DSTs. The number of leaves then provides a quick and simpler look at the "fringes" of a tree. Such nodes are sometimes referred to as the external-internal nodes or internal endnodes in the literature; see [16, 26, 41, 56].

Let $X_n$ denote the number of leaves in a random DST of $n$ keys. Then $X_n$ satisfies the recurrence
\[ X_{n+1}\stackrel{d}{=}X_{B_n}+X^*_{n-B_n} \qquad (n\ge1), \tag{57} \]
with $X_0 = 0$ and $X_1 = 1$, where $B_n\sim\mathrm{Binomial}(n;1/2)$.

Flajolet and Sedgewick [26], solving an open question raised by Knuth, showed that
\[ \mathbb{E}(X_n) = n\bigl(C_{fs}+\varpi_{fs}(\log_2n)\bigr)+O(n^{1/2}), \]
where $\varpi_{fs}(t)$ is a smooth, 1-periodic function and

c fi = 1 + Yl -q^k Yl ^731 - [ ^ + ( J2 ) ~ Y 

h>\ i<j<k y 6 \fc>i / k>i 

$\approx 0.37204\,86812\cdots$.

A finer approximation, together with the alternative (and numerically better) expression 

{-l) k k 



a 



■fi 



Y 2 k-\ Q \ l og 2 + ^ 

fc>l \ to k>l 



Q k {2 k - l)2 fe ( fe + 1 )/ 2 



was derived by Kirschenhofer and Prodinger [39]; see also [56]. They proved additionally the asymptotic linearity of the variance
\[ \mathbb{V}(X_n)\sim n\bigl(C_{kp}+\varpi_{kp}(\log_2n)\bigr), \]
where $\varpi_{kp}$ is a smooth, 1-periodic function with mean zero and a long, complicated expression is given for the leading constant $C_{kp}$. We derive different forms for these two asymptotic approximations.

Define
\[ \tilde g_2(z) := z\tilde f_1''(z)^2+e^{-z}\Bigl(1-e^{-z}(1+z)+2z\tilde f_1'(z/2)-4\tilde f_1(z/2)\Bigr), \tag{58} \]
where $\tilde f_1(z) := e^{-z}\sum_{n\ge0}\mathbb{E}(X_n)z^n/n!$.

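Theorem 4.2. The mean and the variance of the number of leaves are both asymptotically linear with the approximations
\[
\begin{aligned}
\mathbb{E}(X_n) &= \frac{n}{\log2}\sum_{k\in\mathbb{Z}}\frac{G_1(2+\chi_k)}{\Gamma(2+\chi_k)}n^{\chi_k}+O(1),\\
\mathbb{V}(X_n) &= \frac{n}{\log2}\sum_{k\in\mathbb{Z}}\frac{G_2(2+\chi_k)}{\Gamma(2+\chi_k)}n^{\chi_k}+O(1),
\end{aligned}
\]
where the two series are absolutely convergent with $G_1$, $G_2$ defined by
\[
G_1(\omega) := \int_0^\infty\frac{s^{\omega-1}}{(s+1)Q(-2s)}\,\mathrm{d}s,\qquad
G_2(\omega) := \int_0^\infty\frac{s^{\omega-1}}{Q(-2s)}\int_0^\infty e^{-sz}\tilde g_2(z)\,\mathrm{d}z\,\mathrm{d}s,
\]
for $\Re(\omega)>0$.

We see in particular that
\[
C_{fs} = \frac1{\log2}\int_0^\infty\frac{s}{(s+1)Q(-2s)}\,\mathrm{d}s,\qquad
C_{kp} = \frac1{\log2}\int_0^\infty\frac{s}{Q(-2s)}\int_0^\infty e^{-sz}\tilde g_2(z)\,\mathrm{d}z\,\mathrm{d}s. \tag{59}
\]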

Sketch of proof. From (57), we derive the equation for the bivariate generating function $\tilde F(z,y) := e^{-z}\sum_{n\ge0}\mathbb{E}(e^{X_ny})z^n/n!$
\[ \tilde F(z,y)+\frac{\partial}{\partial z}\tilde F(z,y) = \tilde F\Bigl(\frac z2,y\Bigr)^2+(e^y-1)e^{-z}, \]
with $\tilde F(0,y) = 1$. Then the Poisson generating functions of the first two moments satisfy
\[
\begin{aligned}
\tilde f_1(z)+\tilde f_1'(z) &= 2\tilde f_1(z/2)+e^{-z},\\
\tilde f_2(z)+\tilde f_2'(z) &= 2\tilde f_2(z/2)+2\tilde f_1(z/2)^2+e^{-z},
\end{aligned} \tag{60}
\]
with $\tilde f_1(0) = \tilde f_2(0) = 0$. Consequently, the function $V(z) := \tilde f_2(z)-\tilde f_1(z)^2-z\tilde f_1'(z)^2$ satisfies
\[ V(z)+V'(z) = 2V(z/2)+\tilde g_2(z), \]
with $V(0) = 0$, where $\tilde g_2$ is given in (58). The remaining analysis follows the same pattern as above and is omitted.
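For small $n$ the first two moments of the number of leaves can be computed exactly from the distributional recurrence, and the integral form of $C_{fs}$ can be evaluated numerically. The sketch assumes $X_0 = 0$, $X_1 = 1$ and $Q(-2s) = \prod_{j\ge0}(1+s/2^j)$; function names and grid parameters are illustrative.

```python
import math
from fractions import Fraction
from math import comb

def leaf_moments(n_max):
    """Exact mean and variance of the number of leaves, n = 0..n_max (n_max >= 1)."""
    mu = [Fraction(0), Fraction(1)] + [Fraction(0)] * (n_max - 1)
    s2 = [Fraction(0), Fraction(1)] + [Fraction(0)] * (n_max - 1)
    for n in range(1, n_max):
        w = Fraction(1, 2 ** n)
        mu[n + 1] = 2 * w * sum(comb(n, k) * mu[k] for k in range(n + 1))
        # E(X_B + X*_{n-B})^2, using the symmetry k <-> n-k for the squares
        s2[n + 1] = w * sum(comb(n, k) * (2 * s2[k] + 2 * mu[k] * mu[n - k])
                            for k in range(n + 1))
    var = [s2[n] - mu[n] ** 2 for n in range(n_max + 1)]
    return mu, var

def c_fs_constant(t_min=-40.0, t_max=45.0, h=0.05, terms=200):
    """(1/log 2) * int_0^oo s / ((s+1) Q(-2s)) ds via s = e^t."""
    n = int(round((t_max - t_min) / h))
    total = 0.0
    for i in range(n + 1):
        t = t_min + i * h
        s = math.exp(t)
        log_q = sum(math.log1p(s / 2.0 ** j) for j in range(terms))
        total += (0.5 if i in (0, n) else 1.0) * math.exp(2.0 * t - math.log1p(s) - log_q)
    return total * h / math.log(2)

mu, var = leaf_moments(40)
cfs = c_fs_constant()
print(cfs, float(mu[40]) / 40, float(var[40]) / 40)
```

The last two printed ratios only give a rough impression of the linear growth of the mean and the variance, since the convergence is slow and superimposed with small periodic fluctuations.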
We provide instead some details for the numerical evaluation of the constant $C_{kp}$ as defined in (59), which is similar to the case of internal path-length of DSTs.

By applying the Laplace transform to both sides of (60) and by iteration, we get
\[ \mathscr{L}[\tilde f_1;s] = \sum_{k\ge0}\frac{4^k}{(s+1)(2s+1)\cdots(2^{k-1}s+1)(2^ks+1)^2}. \]
Since the inverse Laplace transform derived from the partial fraction expansion of this series is divergent, we consider the function $\bar f_1(z) := \tilde f_1(z)-z+z^2/2$ for which the equation (60) becomes
\[ \bar f_1(z)+\bar f_1'(z) = 2\bar f_1(z/2)-1+z+\frac{z^2}{4}+e^{-z}, \]
with $\bar f_1(0) = 0$, and we have
\[ \mathscr{L}[\bar f_1;s] = \sum_{k\ge0}\frac{3\cdot2^ks+1}{2s^32^k(s+1)\cdots(2^{k-1}s+1)(2^ks+1)^2}. \]
Then by the partial fraction expansion



3-2 fe s + l _ ^ (-1)^(3 ■ 2 k - e - l)2~( k 1 



( a + l)...(2*-i s + l)(2*s + l)2 Q ^ fc (2*-^-l)Q / Q fc _ < 2^ + 1 



+ Q fc ( 3 + 2 ^ 2*-l) 2*a + l 



2 



Q fe (2 fc s + 1} 



we obtain 



^E^Q, ( 2 ^ + i 



^>o 

where 



(2's + l) 2 



1 (-l) j (3 ■ 2* - 1)2~( J 2 1 ) 
^-3 + 2 ^ + ( 2 . - 1)2^- 

Obviously, lim^oo Si = 4. Now, by the inverse Laplace transform, 

AC*) 



"2' +1 (3-^l + 5 4-3e-^) + 2^ 

which converges for all z; also from [26] we have 

{-\) n ~ x z n ^ 1 



n>3 0<j<n-2 ^ VJ 7 
Then the first and the second derivatives are given by

/:w = 5 E ^ (* (-1 + «/* + + 4 - ^ - - 



Z _ e -z/2 e 



£>0 



Now the constant C^ p can be expressed in terms of the integrals of fx as follows. 

oo „ poo 







+ 2 / -A— / e-' (s+1 > ' 



Q ( -2s). k - j(/;(,/2)-,)d,d, 
And we get $C_{kp}\approx 0.034203$.

A general weighted sum of node-types for $b$-DSTs. For $b\ge2$, we can consider $X_n^{[j]}$, $1\le j\le b$, the number of leaves containing $j$ records in a random $b$-DST with bucket capacity $b$ built from $n$ records. Let also $X_n^{[b+1]}$ be the number of internal (non-leaf) nodes. Define
\[ X_n := \sum_{1\le j\le b+1}a_jX_n^{[j]}, \]
where $a_1,\dots,a_{b+1}$ are arbitrary real numbers. By a straightforward computation
\[ \sum_{0\le j\le b}\binom bj\frac{\partial^j}{\partial z^j}\tilde F(z,y) = e^{a_{b+1}y}\tilde F\Bigl(\frac z2,y\Bigr)^2+e^{-z}\bigl(e^{a_by}-e^{a_{b+1}y}\bigr), \]
with $\tilde F(0,y) = 1$. Then our approach can be applied and leads to the same type of results as Theorem 4.2 with different $G_1$ and $G_2$; the resulting expressions for the variance are more explicit and simpler than those given in [31].

4.3 Colless index: the differential path-length (DPL) 

The DPL of a tree is defined to be the sum over all nodes of the absolute difference of the two subtree-sizes of each node:
\[ \mathrm{DPL} = \sum_{\text{all nodes}}\bigl|T_{\text{left}}-T_{\text{right}}\bigr|. \]

Properties of such a path length in random binary search trees have long been investigated in the systematic biology literature; see [4] and the references therein.

Let $X_n$ denote the DPL of a random DST of $n$ input-strings. Then by definition and by our independence assumption, we have the recurrence for the moment generating function
\[ M_{n+1}(y) = 2^{-n}\sum_{0\le j\le n}\binom nj M_j(y)M_{n-j}(y)e^{|n-2j|y} \qquad (n\ge0), \tag{61} \]
with $M_0(y) = 1$.
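Let also
\[ \tilde g_2(z) := z\tilde f_1''(z)^2+z-\tilde h_1(z)^2-z\tilde h_1'(z)^2-4\tilde h_1(z)\tilde f_1(z/2)-2z\tilde h_1'(z)\tilde f_1'(z/2)+4\tilde h_c(z), \]
where $\tilde f_1(z)$ is the Poisson generating function of $\mathbb{E}(X_n)$ and $\tilde h_c(z)$ is defined by
\[ \tilde h_c(z) := e^{-z}\sum_{n\ge0}\frac{z^n}{n!2^n}\sum_{0\le k\le n}\binom nk\mathbb{E}(X_k)|n-2k|. \tag{62} \]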

Theorem 4.3. The mean and the variance of the DPL of random DSTs satisfy the asymptotic relations
\[ \mathbb{E}(X_n) = nP_{d,\mu}(\log_2n)-\frac{\sqrt{2n}}{\sqrt{\pi}(\sqrt2-1)}+O(1), \tag{63} \]
\[ \mathbb{V}(X_n) = \Bigl(1-\frac2\pi\Bigr)n\log_2n+nP_{d,\sigma}(\log_2n)+O(n^{1/2}), \tag{64} \]
where $P_{d,\mu}(t)$ and $P_{d,\sigma}(t)$ are explicitly computable, smooth, 1-periodic functions.

These results are to be compared with the known results for random binary search trees for which the DPL has mean of order $n\log n$ and variance of order $n^2$; see [4].

Expected DPL. The approach we follow here for deriving the differential-functional equations satisfied by the Poisson generating functions of the first two moments is slightly different from the one we used since the corresponding nonlinear equation for the bivariate generating function $F(z,y) := \sum_{n\ge0}M_n(y)z^n/n!$ is very involved as given below.

|-F(*,y)-l = FfeyVf^V 



dz v V 2 / V 2 

j_ / f(— y) ( F ( eVz /^y)- w ' le ' yF ( z /( 2w )^y) 

2ni J H=r>0 V 2 ' / V w-e-y 



F(e- y z/2,y) - w- l e v F{z/(2w),y) 
w — e y 



dw, 



with $F(0,y) = 1$.

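We use instead a more elementary argument. From the recurrence (61), we obtain, with $\mu_n := \mathbb{E}(X_n)$,
\[ \mu_{n+1} = 2^{1-n}\sum_{0\le k\le n}\binom nk\mu_k+2^{-n}\sum_{0\le k\le n}\binom nk|n-2k| \qquad (n\ge0), \]
the initial condition being $\mu_0 = 0$. Then the Poisson generating function of $\mathbb{E}(X_n)$ satisfies the equation
\[ \tilde f_1(z)+\tilde f_1'(z) = 2\tilde f_1(z/2)+\tilde h_1(z), \]
with $\tilde f_1(0) = 0$, where $\tilde h_1$ is given by
\[ \tilde h_1(z) := e^{-z}\sum_{n\ge0}\frac{z^n}{n!2^n}\sum_{0\le k\le n}\binom nk|n-2k| = ze^{-z}\bigl(I_0(z)+I_1(z)\bigr), \]
where we used the identity
\[ \sum_{0\le k\le n}\binom nk|n-2k| = \frac{2\,n!}{\lfloor n/2\rfloor!\,(\lceil n/2\rceil-1)!} \qquad (n\ge1), \]
and $I_\alpha(z)$ denotes the modified Bessel functions
\[ I_\alpha(z) := \sum_{n\ge0}\frac{(z/2)^{2n+\alpha}}{n!\,\Gamma(n+\alpha+1)}. \]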

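It is known (see [63]) that, as $|z|\to\infty$,
\[
I_\alpha(z) =
\begin{cases}
\dfrac{e^z}{\sqrt{2\pi z}}\bigl(1+O(|z|^{-1})\bigr), & \text{if } |\arg(z)|\le\pi/2-\varepsilon,\\[2ex]
O\bigl(|z|^{-1/2}(e^{\Re(z)}+e^{-\Re(z)})\bigr), & \text{if } |\arg(z)|\le\pi,
\end{cases} \tag{65}
\]
the $O$-term holding uniformly in $z$ in each case. Thus, by (65), $\tilde h_1\in\mathscr{JS}$ and
\[ \tilde h_1(z) = \sqrt{\frac{2z}{\pi}}\bigl(1+O(|z|^{-1})\bigr), \]
for $|z|\to\infty$ in $|\arg(z)|\le\pi/2-\varepsilon$. Also
\[ \mathscr{L}[\tilde h_1;s] = (s+2)^{-1/2}s^{-3/2} \qquad (\Re(s)>0). \]
Thus we can apply the same approach and deduce that
\[ \mathbb{E}(X_n) = \frac{n}{\log2}\sum_{k\in\mathbb{Z}}\frac{G_1(2+\chi_k)}{\Gamma(2+\chi_k)}n^{\chi_k}-\frac{\sqrt{2n}}{\sqrt\pi(\sqrt2-1)}+O(1), \]
where $G_1(\omega)$ is the Mellin transform of $\mathscr{L}[\tilde h_1;s]/Q(-2s)$
\[ G_1(\omega) = \int_0^\infty\frac{s^{\omega-5/2}}{Q(-2s)\sqrt{s+2}}\,\mathrm{d}s \qquad (\Re(\omega)>3/2). \]
This proves (63). Numerically, the mean value of the dominant periodic function is $G_1(2)/\log2\approx 1.3390746494$.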
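The ingredients of the mean analysis can be checked independently. The sketch below verifies the binomial identity for $\sum_k\binom nk|n-2k|$ with exact integers, iterates the mean recurrence $\mu_{n+1} = 2^{-n}\sum_k\binom nk(2\mu_k+|n-2k|)$, and evaluates $G_1(2)/\log 2$ by the trapezoidal rule after the substitution $s = e^t$. It assumes $Q(-2s) = \prod_{j\ge0}(1+s/2^j)$; all function names and grid parameters are illustrative.

```python
import math
from math import comb, factorial

def abs_dev_sum(n):
    """sum_k C(n,k) |n - 2k|, computed directly."""
    return sum(comb(n, k) * abs(n - 2 * k) for k in range(n + 1))

def abs_dev_closed(n):
    """2 n! / (floor(n/2)! (ceil(n/2) - 1)!) for n >= 1."""
    return 2 * factorial(n) // (factorial(n // 2) * factorial((n + 1) // 2 - 1))

def dpl_means(n_max):
    """E(X_n) for the differential path-length, n = 0..n_max (floating point)."""
    mu = [0.0] * (n_max + 1)
    for n in range(n_max):
        acc = sum(comb(n, k) * (2.0 * mu[k] + abs(n - 2 * k)) for k in range(n + 1))
        mu[n + 1] = acc / 2 ** n
    return mu

def dpl_mean_constant(t_min=-70.0, t_max=45.0, h=0.05, terms=200):
    """(1/log 2) * int_0^oo s^(-1/2) / (Q(-2s) sqrt(s+2)) ds via s = e^t."""
    n = int(round((t_max - t_min) / h))
    total = 0.0
    for i in range(n + 1):
        t = t_min + i * h
        s = math.exp(t)
        log_q = sum(math.log1p(s / 2.0 ** j) for j in range(terms))
        total += (0.5 if i in (0, n) else 1.0) * math.exp(0.5 * t - 0.5 * math.log(s + 2.0) - log_q)
    return total * h / math.log(2)

mu = dpl_means(200)
const = dpl_mean_constant()
print(const, mu[200] / 200)
```

Because of the correction term of order $\sqrt n$ in (63), the ratio $\mathbb{E}(X_n)/n$ stays visibly below the limiting constant for moderate $n$.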

The variance of DPL. Again from (61), we have the recurrence for the second moment $s_n := \mathbb{E}(X_n^2)$
\[ s_{n+1} = 2^{-n}\sum_{0\le k\le n}\binom nk\bigl(s_k+s_{n-k}+(n-2k)^2+2\mu_k\mu_{n-k}+4\mu_k|n-2k|\bigr), \]
for $n\ge1$ with $s_0 = s_1 = 0$. Since
\[ 2^{-n}\sum_{0\le k\le n}\binom nk(n-2k)^2 = n, \tag{66} \]
we see that the Poisson generating function of $s_n$ satisfies the nonlinear equation
\[ \tilde f_2(z)+\tilde f_2'(z) = 2\tilde f_2(z/2)+2\tilde f_1(z/2)^2+z+4\tilde h_c(z), \]
with $\tilde f_2(0) = 0$, where $\tilde h_c(z)$ is defined in (62).
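The second-moment recurrence and the identity (66) lend themselves to an exact check for small $n$. In the sketch below, which uses rational arithmetic and illustrative names, the case $n = 3$ can be verified by hand: the tree is either balanced (DPL $0$) or a chain (DPL $3$), each with probability $1/2$.

```python
from fractions import Fraction
from math import comb

def second_central_identity(n):
    """Left-hand side of (66) times 2^n: sum_k C(n,k) (n - 2k)^2."""
    return sum(comb(n, k) * (n - 2 * k) ** 2 for k in range(n + 1))

def dpl_moments(n_max):
    """Exact E(X_n) and E(X_n^2) for the DPL, n = 0..n_max."""
    mu = [Fraction(0)] * (n_max + 1)
    s2 = [Fraction(0)] * (n_max + 1)
    for n in range(n_max):
        w = Fraction(1, 2 ** n)
        m_acc = Fraction(0)
        s_acc = Fraction(0)
        for k in range(n + 1):
            d = abs(n - 2 * k)
            c = comb(n, k)
            m_acc += c * (2 * mu[k] + d)
            # s_k + s_{n-k} is replaced by 2 s_k using the symmetry k <-> n-k
            s_acc += c * (2 * s2[k] + d * d + 2 * mu[k] * mu[n - k] + 4 * mu[k] * d)
        mu[n + 1] = w * m_acc
        s2[n + 1] = w * s_acc
    return mu, s2

mu, s2 = dpl_moments(30)
variances = [s2[n] - mu[n] ** 2 for n in range(31)]
print(float(variances[30]))
```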
Lemma 4.4. The function $\tilde h_c$ is $\mathscr{JS}$-admissible and satisfies
\[ \tilde h_c(z) = \tilde h_1(z)\tilde f_1(z/2)+O(|z|^{1/2}), \]
in the sector $|\arg(z)|\le\pi/2-\varepsilon$.
Proof. Observe first that



k>0 



k\ \2 

n>0 
1 (Z\ k 



\n — k\ (z\ n 
,2, 



ni 



- 2 E^ U E 



— j (z\i 



k\ V2 

fc>0 0<j<fe 
2 /■ e z(w+l/iu)/2 



2ni 



\w\=r 



{w-iy 



f. \2 
dw 



(r < 1), 



since 



E 

0<j<k 



k — j /^y' 

~7T 



we 



zw/2 



to 



{w-iy 



On the other hand, since $f_1(z) = \sum_{n\ge0}\mu_nz^n/n!$, we have, by the same argument,



h c (z) := e z h c (z) 



Et(!)'T 



k>0 



k\ \2 



n>0 



\n — k\ i z 
.2 



-Et(ir(E 

fe>0 \0<n< 



k — n ( z\ n n — k f z\ n 

,9 +£ 



n! 



E£(S)"E 



fc>0 



k\ \2 



0<n<k 

k — n f z\ n 



n>k 



0<n<k 



nl 



n>0 



If ( z \ e wz/2 1 / 

27TZ J\ w \ =r<1 ^\2w) (w - l) 2 dW + 2ni Ju 



G)" + E a (f)"2> -*>$(£)■ 

0<fc<n 

r fwz\ e z/{2w) , 
H=r <i V 2 / (w - l) 2 



To prove condition (O), we start with changes of variables, giving 



h c (z) 



2m 



T\ w \=\z\ 1 \2J{w-z) 2 2mJ M= \ z \ 



w\ e z2 '^ 



2 J (w - zf 



dw, 



(67) 



where the first integration circle is indented to the right to avoid the polar singularity $w = z$, and the second to the left. By splitting each integration contour into two parts, we obtain



e< <tt 



fl 



z\e 



■16 



e | 2 |(cos*)/2 d £ , 



where the integration contour =f is any path connecting the two endpoints |,2|e ±ie and indented to the right, 
and j denotes the corresponding symmetric contour with respect to w = z (and indented to the left). Since 
$\tilde f_1\in\mathscr{JS}$, condition (O) for $\tilde h_c(z)$ is readily checked.
For condition (I), it suffices to prove (67). For that purpose, we use the representation

e~ Z f e z(w+l/w)/2 ( _ ( z . 



/ 7\ p~ z f p z(w+l/w)/2 —z (■ 

= h (f ) 7T- f ~ ( 7sr dw+ e —f e^"» 2 R z (w) dw 

\2J 2m J\ w \ =r<l (w-iy 2niJ lwl=r<1 

e 



where 



~h{zl2)h x {z) + —d> e< w+1 ^ 2 R z (w) dw, 
2 ™ J\ w \=i 



V 2 / J V2/ 1 \2/ 2 

is analytic at w = z. The error term of $\tilde h_c(z)-\tilde h_1(z)\tilde f_1(z/2)$ can be estimated by an argument similar to that used for checking condition (O). This completes the proof of the lemma. □

The remaining analysis is now routine. Let $V(z) := \tilde f_2(z)-\tilde f_1(z)^2-z\tilde f_1'(z)^2$. Then
\[ V(z)+V'(z) = 2V(z/2)+\tilde g_2(z), \]
where, by Lemma 2.7,
\[
\begin{aligned}
\tilde g_2(z) &= z-\tilde h_1(z)^2+4\bigl(\tilde h_c(z)-\tilde h_1(z)\tilde f_1(z/2)\bigr)-z\tilde h_1'(z)^2-2z\tilde h_1'(z)\tilde f_1'(z/2)+z\tilde f_1''(z)^2\\
&= \Bigl(1-\frac2\pi\Bigr)z+O(|z|^{1/2}),
\end{aligned}
\]
for $|\arg(z)|\le\pi/2-\varepsilon$. From this and the analytic properties of the functions involved, we deduce (64).

Remark. The same approach can be extended to more general differential path-length of the form $\sum_{\text{all nodes}}|T_{\text{left}}-T_{\text{right}}|^m$ with $m\ge2$. Interestingly, when $m = 2$, the mean is identical to the total internal path-length in view of (66) and the variance is asymptotic to $4n^2$. For $m>2$, the mean and the variance are asymptotic to
\[
\frac{2^{m/2}\,\Gamma((m+1)/2)}{\sqrt\pi\,(1-2^{1-m/2})}\,n^{m/2}
\qquad\text{and}\qquad
\frac{2^m\bigl(\Gamma(m+1/2)-\pi^{-1/2}\Gamma((m+1)/2)^2\bigr)}{\sqrt\pi\,(1-2^{1-m})}\,n^m,
\]
respectively.

4.4 A weighted path-length (WPL) 

Weighted path-lengths of the form $W_n := \sum_{1\le j\le n}w_j\ell_j$ appear often in applications, where $\ell_j$ denotes the distance of the $j$-th node (arranged in an appropriate manner, say first level-wise and then left-to-right or in their incoming order) to the root and $w_j$ the weight attached to the $j$-th node. The calculation of $W_n$ in the case of random DSTs can be carried out recursively by
\[ W_{n+1}\stackrel{d}{=}W_{B_n}+W^*_{n-B_n}+\sum_{2\le j\le n+1}w_j, \]
assuming that the root is labelled 1. We consider in this section the case when $w_j = (\log j)^m$, $m\ge1$. From a technical point of view, it suffices to consider the random variables
\[ X_{n+1}\stackrel{d}{=}X_{B_n}+X^*_{n-B_n}+(n+1)(\log(n+1))^m \qquad (n\ge0), \]
with $X_0 = 0$, since the partial sum $\sum_{2\le j\le n}(\log j)^m$ is nothing but
\[ \sum_{2\le j\le n}(\log j)^m = [z^n]\frac{L_{0,m}(z)}{1-z}, \]
where
\[ L_{a,m}(z) := \sum_{k\ge1}k^{-a}(\log k)^mz^k \qquad (a\ne1,2,\dots), \]
on whose analytic properties our analytic approach heavily relies.

The random variables $X_n$ represent the sole example on DSTs we discuss in this paper with non-integral values; they also exhibit an interesting phenomenon in that the mean is of order $n(\log n)^{m+1}$ but the variance is asymptotic to $n$ times a periodic function, in contrast to the orders of DPL.

Theorem 4.5. The mean and the variance of the weighted path-length $X_n$ are asymptotic to
\[ \mathbb{E}(X_n) = \frac{n(\log n)^{m+1}}{(m+1)\log2}+n\sum_{1\le j\le m}c_{m,j}(\log n)^j+nP_{w,\mu}(\log_2n)+O\bigl((\log n)^{m+1}\bigr), \]
\[ \mathbb{V}(X_n) = nP_{w,\sigma}(\log_2n)+O\bigl((\log n)^{2m+2}\bigr), \]
respectively, where the $c_{m,j}$'s are constants depending on $m$, and $P_{w,\mu}$ and $P_{w,\sigma}$ are 1-periodic, smooth functions.

That the variance is linear is well-predicted by the deep theorem of Schachinger derived in [58] since the second difference of the sequence $n(\log n)^m$ is $o(n^{-1/2-\varepsilon})$. Our approach has the advantage of providing more precise approximations.

The new ingredient we need is incorporated in the following lemma. 

Lemma 4.6 ([21]). The function $L_{a,m}(z)$ can be analytically continued into the cut-plane $\mathbb{C}\setminus[1,\infty)$ with a sole singularity at $z = 1$ near which it admits the asymptotic approximation

L a ,m(e~ S ) = r(l - a)s a -\- \ogs) m + 0(1), 

the $O$-term holding uniformly for $|\arg(s)|\le\pi-\varepsilon$.

Indeed, the tools developed in [21] can also be easily extended to similar "toll-functions" such as $nH_n^m$. Details are left to the interested reader.
5 Conclusions and extensions



We showed in this paper, through many shape parameters on random DSTs, that the crucial use of the normalization $V(z) := \tilde f_2(z)-\tilde f_1(z)^2-z\tilde f_1'(z)^2$ at the level of Poisson generating functions is extremely helpful in simplifying the asymptotic analysis of the variance as well as the resulting expressions. The same idea can be applied to a large number of concrete problems with a binomial splitting procedure. These and some related topics and extensions will be pursued elsewhere. We briefly mention in this final section some extensions and related properties.



Central limit theorems. All shape parameters we considered in this paper are asymptotically normally distributed in the sense of convergence in distribution. We describe the results in this section and merely indicate the methods of proofs. The only case that requires a separate study is NPL of random $b$-DSTs with $b\ge2$ (a bivariate consideration of the limit laws is needed), details being given in a future paper.

Theorem 5.1. The internal path-length, the peripheral path-length, the number of leaves, the differential path-length, the weighted path-length of random DSTs, and the key-wise path-length of random $b$-DSTs with $b\ge2$ are all asymptotically normally distributed
\[ \frac{X_n-\mathbb{E}(X_n)}{\sqrt{\mathbb{V}(X_n)}}\xrightarrow{d}\mathscr{N}(0,1), \]
where $X_n$ denotes any of these shape parameters, $\xrightarrow{d}$ stands for convergence in distribution, and $\mathscr{N}(0,1)$ is a standard normal distribution with zero mean and unit variance.

See Figure 10 for a plot of the histograms of DPL.

The method of moments applies to all these cases and establishes the central limit theorems; similar details are given as in [31] (the asymptotic normality of the number of leaves being already proved there as a special case).

In a parallel way, the contraction method also works well for all these shape parameters; see [51, 52, 53].

On the other hand, Schachinger's asymptotic normality results cover the IPL, PPL, number of leaves and WPL, but not DPL and KPL on $b$-DSTs, although his approach may be modified for that purpose.

Finally, the complex-analytic approach used in [35] for internal path-length may be extended to prove some of these cases, but the proofs are messy, although the results established are often stronger (for example, with convergence rate).


Figure 10: The histograms of DPL for n = 20, 30, 40 and 50, normalized by their standard deviations.

The depth. The asymptotic analysis we used in this paper can also be extended to the depth (the distance between a randomly chosen internal node and the root) although it is of logarithmic order. Let $X_n$ denote the depth of a random DST of $n$ nodes. The starting point is to consider the expected profile polynomial
\[ P_n(y) := n\sum_{0\le k<n}\mathbb{P}(X_n = k)y^k, \]
where $n\mathbb{P}(X_n = k)$ is nothing but the expected number of internal nodes at distance $k$ to the root. Then we have the recurrence
\[ P_{n+1}(y) = 1+y2^{-n}\sum_{0\le k\le n}\binom nk\bigl(P_k(y)+P_{n-k}(y)\bigr) \qquad (n\ge0), \]
with $P_0(y) = 0$. From this relation, we obtain the equation for the Poisson generating function $\tilde F(z,y)$ of $P_n(y)$
\[ \tilde F(z,y)+\frac{\partial}{\partial z}\tilde F(z,y) = 2y\tilde F\Bigl(\frac z2,y\Bigr)+1, \]
with $\tilde F(0,y) = 0$. It follows, by taking coefficients of $z^n$ on both sides and by solving the resulting recurrence, that
\[ P_n(y) = \sum_{1\le k\le n}\binom nk(-1)^{k-1}\prod_{0\le j\le k-2}\Bigl(1-\frac{y}{2^j}\Bigr); \]
see [44, p. 504] for a different proof. Asymptotic approximation to $P_n(y)$ can then be obtained by Rice's integral formula
\[ P_n(y) = n+\frac{y-1}{2\pi i}\int_{(3/2)}\frac{\Gamma(n+1)\Gamma(-s)Q(y)}{\Gamma(n+1-s)(1-2^{1-s}y)Q(2^{1-s}y)}\,\mathrm{d}s, \]
for $|y-1|\le\varepsilon$. More precisely, if $t\in\mathbb{C}$ lies in a small neighborhood of the origin, then

PJe 1 ) 



E e 



n 



(e l - l)Q(e f 
Q(l)log2 



Er -i 



t 



log 2 



(68) 



uniformly for $|t|\le\varepsilon$. Alternatively, one can also apply the Laplace-Mellin-de-Poissonization approach and obtain the same type of result for not only DSTs but also for more general $b$-DSTs. See [48, 49] for a more general and detailed treatment (by a different approach).

The estimate (68) leads to effective asymptotic estimates for all moments of $X_n-\log_2n$ by standard arguments; see [32]. In particular, we obtain
\[ \mathbb{E}(X_n) = \log_2n+\frac{\gamma-1}{\log2}+\frac12-\sum_{k\ge1}\frac{1}{2^k-1}+\varpi_1(\log_2n)+O\bigl(n^{-1}\log n\bigr), \]



V(A '"» = T2 + (i3i2P + t) - £ ps^ijs + »»0»& ») + (»"' lo « 2 ») • 

where the estimate for the mean is exactly (7) with w\ given in (8) and w§ is a smooth periodic function. 
An analytic extension. From a purely analytic viewpoint, the underlying differential-functional equation (13) for the moments can be extended to an equation of the form
\[ \sum_{0\le j\le b}\binom bj\tilde f^{(j)}(z) = \alpha\tilde f\Bigl(\frac z\beta\Bigr)+\tilde g(z) \qquad (\alpha>0;\ \beta>1), \]
for which our approach still applies, leading to the functional equation for the Laplace transform
\[ (s+1)^b\mathscr{L}[\tilde f;s] = \alpha\beta\mathscr{L}[\tilde f;\beta s]+\mathscr{L}[\tilde g;s]. \]
The natural normalizing function is then provided by
\[ Q_\beta(-s) := \prod_{j\ge1}\Bigl(1+\frac{s}{\beta^j}\Bigr)^b, \]
and the corresponding Laplace-Mellin asymptotic analysis is similar.

In particular, the case when $\alpha = \beta = m$ corresponds to a straightforward extension of binary DSTs to $m$-ary DSTs (and the binary unbiased Bernoulli random variable to the uniform distribution over $\{0,1,\dots,m-1\}$). The stochastic behaviors of all shape parameters on such trees follow the same patterns as shown in this paper.

Yet another concrete instance arises in the so-called Eden model studied by Dean and Majumdar [10], which corresponds to $\alpha = m$ and $\beta>1$. The model is constructed in the following way. We start at time $t = 0$ at which we have an empty node. Then at time $t = T$, where $T\sim\mathrm{Exponential}(1)$, we fill the empty node and attach to it $m$ different empty nodes. The process then continues independently for each empty
node by the following recursive rule. Once an empty node of depth j is attached to a tree at time t = t', it 
is then filled at time point t' + T, where T ~ E{^), and m new empty nodes are attached to it. 

The mean and the variance of the number of filled nodes at a large time of such trees are studied in detail in [10]. Since the model is continuous, there is no need to de-Poissonize to derive the asymptotics of the coefficient; as a consequence, no correction term as we used in this paper is required for the asymptotics of the variance.

Other DST-type recurrences. While the technique of Poissonized variance with correction remains 
useful for the natural case when the Bernoulli random variable is no longer symmetric, the Laplace-Mellin 
approach does not apply directly. Other asymptotic ingredients are needed such as a direct manipulation 
of the Mellin transforms; see [49] and the references therein. 

DST-type structures and recurrences also arise in other statistical physical models such as the diffusion- 
limited aggregation; see [1, 5]. 
Appendix. An Elementary Approach to the Asymptotic Linearity of the Variance.

We describe briefly here a direct elementary approach to the variance of random variables satisfying the recurrence
\[ X_{n+1}\stackrel{d}{=}X_{B_n}+X^*_{n-B_n}+T_n, \]
where
\[ \pi_{n,k} := \mathbb{P}(B_n = k) = 2^{-n}\binom nk \qquad (0\le k\le n). \]

The starting point is to consider the recurrence satisfied by the variance $v_n := \mathbb{V}(X_n)$
\[ v_{n+1} = \sum_{0\le k\le n}\pi_{n,k}(v_k+v_{n-k})+u_n+\mathbb{V}(T_n), \]
where $\mu_n := \mathbb{E}(X_n)$ and
\[ u_n := \sum_{0\le k\le n}\pi_{n,k}\bigl(\mu_k+\mu_{n-k}-\mu_{n+1}+\mathbb{E}(T_n)\bigr)^2. \]

In most cases, we have the estimate $\mu_k = \tilde f_1(k)+O(k^{\varepsilon})$. This, together with the Gaussian approximation of the binomial distribution, implies that
\[
\begin{aligned}
u_n &\approx \sum_{\substack{|k-n/2| = o(n^{2/3})\\ k = n/2+x\sqrt n/2}}\pi_{n,k}\Bigl(\tilde f_1\bigl(\tfrac n2+\tfrac x2\sqrt n\bigr)+\tilde f_1\bigl(\tfrac n2-\tfrac x2\sqrt n\bigr)-\tilde f_1(n+1)+\mathbb{E}(T_n)\Bigr)^2\\
&\approx \sum_{\substack{|k-n/2| = o(n^{2/3})\\ k = n/2+x\sqrt n/2}}\pi_{n,k}\Bigl(2\tilde f_1\bigl(\tfrac n2\bigr)-\tilde f_1(n)-\tilde f_1'(n)+\mathbb{E}(T_n)\Bigr)^2\\
&\approx \Bigl(2\tilde f_1\bigl(\tfrac n2\bigr)-\tilde f_1(n)-\tilde f_1'(n)+\mathbb{E}(T_n)\Bigr)^2.
\end{aligned}
\]
But then (see (13) below)
\[ 2\tilde f_1\bigl(\tfrac n2\bigr)-\tilde f_1(n)-\tilde f_1'(n)+\mathbb{E}(T_n) = \mathbb{E}(T_n)-\tilde h_1(n), \]
where
\[ \tilde h_1(z) := e^{-z}\sum_{j\ge0}\frac{\mathbb{E}(T_j)}{j!}z^j. \]

The order of the difference $\mathbb{E}(T_n)-\tilde h_1(n)\asymp n|\tilde h_1''(n)|$ is expected to be small, roughly $O(n^{\varepsilon})$ in all cases we consider here. Consequently, the variance is asymptotically linear; see [31, 58] for more precise details.

We see clearly that the smallness of the variance results naturally from the high concentration of the binomial distribution near its mean.
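The variance recurrence of this appendix can be iterated exactly for any toll sequence with known mean and variance. The sketch below assumes $X_0 = 0$ and a toll $T_n$ independent of the splitting; the function name and the two example tolls (the deterministic toll $T_n = n$ of the internal path-length, and the toll $T_0 = 1$, $T_n = 0$ for $n\ge1$ that reproduces the number of leaves) are illustrative choices.

```python
from fractions import Fraction
from math import comb

def mean_and_variance(n_max, toll_mean, toll_var):
    """Iterate mu_{n+1} = sum_k pi_{n,k} (mu_k + mu_{n-k}) + E(T_n) and
    v_{n+1} = sum_k pi_{n,k} (v_k + v_{n-k}) + u_n + V(T_n), with X_0 = 0."""
    mu = [Fraction(0)] * (n_max + 1)
    v = [Fraction(0)] * (n_max + 1)
    for n in range(n_max):
        pi = [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]
        t = Fraction(toll_mean(n))
        mu[n + 1] = sum(pi[k] * (mu[k] + mu[n - k]) for k in range(n + 1)) + t
        # u_n: variance of the conditional mean given the split B_n
        u = sum(pi[k] * (mu[k] + mu[n - k] - mu[n + 1] + t) ** 2
                for k in range(n + 1))
        v[n + 1] = (sum(pi[k] * (v[k] + v[n - k]) for k in range(n + 1))
                    + u + Fraction(toll_var(n)))
    return mu, v

# Internal path-length: deterministic toll T_n = n.
ipl_mu, ipl_v = mean_and_variance(40, lambda n: n, lambda n: 0)
# Number of leaves: toll 1 only when the node has no descendants.
leaf_mu, leaf_v = mean_and_variance(40, lambda n: 1 if n == 0 else 0, lambda n: 0)
print(float(ipl_v[40]) / 40, float(leaf_v[40]) / 40)
```

In both examples the printed ratios $v_n/n$ stay bounded, in line with the asymptotic linearity of the variance discussed above.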